Newest ML-Lib on Spark 1.1
Hi all, we’re running CDH 5.2 and would be interested in having the latest and greatest MLlib version on our cluster (with YARN). Could anyone help me out in terms of figuring out what build profiles to use to get this to play well? Will I be able to update MLlib independently of updating the rest of Spark to 1.2 and beyond? I ran into numerous issues trying to build 1.2 against CDH’s Hadoop deployment. Alternatively, if anyone has managed to get the trunk successfully built and tested against Cloudera’s YARN and Hadoop for 5.2, I would love some help. Thanks!

The information contained in this e-mail is confidential and/or proprietary to Capital One and/or its affiliates. The information transmitted herewith is intended only for use by the individual or entity to which it is addressed. If the reader of this message is not the intended recipient, you are hereby notified that any review, retransmission, dissemination, distribution, copying or other use of, or taking of any action in reliance upon this information is strictly prohibited. If you have received this communication in error, please contact the sender and delete the material from your computer.
Re: Newest ML-Lib on Spark 1.1
For CDH this works well for me (tested up to 5.1):

    ./make-distribution.sh -Dhadoop.version=2.3.0-cdh5.1.0 -Phadoop-2.3 -Pyarn -Phive -DskipTests

The -Phive profile is there to build with Hive thriftserver support for spark-sql.

On Fri, Dec 12, 2014 at 1:41 PM, Ganelin, Ilya ilya.gane...@capitalone.com wrote:
Could anyone help me out in terms of figuring out what build profiles to use to get this to play well?
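A minimal sketch of how the distribution command in this reply could be parameterized, offered as an illustration rather than a tested recipe. The variable names HADOOP_VERSION and WITH_THRIFTSERVER and the function dist_cmd are made up for this example. The separate -Phive-thriftserver profile is an assumption about the branch being built (newer build documentation lists it alongside -Phive), and --tgz is the script's option for producing a tarball. The function only prints the command so it can be reviewed before running it from the Spark source root.

```shell
# Sketch only: prints a make-distribution.sh command line for review.
# HADOOP_VERSION and WITH_THRIFTSERVER are illustrative names, not Spark settings.
HADOOP_VERSION="2.3.0-cdh5.1.0"
WITH_THRIFTSERVER="yes"

dist_cmd() {
  profiles="-Phadoop-2.3 -Pyarn -Phive"
  # Assumption: the hive-thriftserver profile exists in the branch being built
  if [ "$WITH_THRIFTSERVER" = "yes" ]; then
    profiles="$profiles -Phive-thriftserver"
  fi
  printf './make-distribution.sh --tgz -Dhadoop.version=%s %s -DskipTests\n' \
    "$HADOOP_VERSION" "$profiles"
}

# Print the command instead of running it
dist_cmd
```

Setting WITH_THRIFTSERVER to anything other than "yes" leaves the profile list exactly as it appears in the reply.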
Re: Newest ML-Lib on Spark 1.1
Could you specify what problems you're seeing? There is nothing special about the CDH distribution at all. The latest and greatest release is 1.1, and that is what is in CDH 5.2. You can certainly compile even master for CDH and get it to work, though. The safest build flags should be -Phadoop-2.4 -Dhadoop.version=2.5.0-cdh5.2.1. CDH 5.3 is just around the corner and includes Spark 1.2, which is also just around the corner.

On Fri, Dec 12, 2014 at 9:41 PM, Ganelin, Ilya ilya.gane...@capitalone.com wrote:
I ran into numerous issues trying to build 1.2 against CDH’s Hadoop deployment.

To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
For additional commands, e-mail: dev-h...@spark.apache.org
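As a rough sketch, the flags from this reply might be combined into a full Maven invocation as below. The -Pyarn profile, -DskipTests, and the `clean package` goals are not in the reply itself; they are taken from the usual Spark build instructions and from the other answer in this thread, so treat them as assumptions. HADOOP_PROFILE, HADOOP_VERSION, and spark_build_cmd are illustrative names. The function prints the command rather than running it.

```shell
# Sketch only: composes the Maven command line from the suggested flags.
# HADOOP_PROFILE and HADOOP_VERSION are illustrative variable names.
HADOOP_PROFILE="hadoop-2.4"
HADOOP_VERSION="2.5.0-cdh5.2.1"

spark_build_cmd() {
  # -Pyarn enables YARN support; -DskipTests skips running the test suite
  printf 'mvn -Pyarn -P%s -Dhadoop.version=%s -DskipTests clean package\n' \
    "$HADOOP_PROFILE" "$HADOOP_VERSION"
}

# Print the command for review before running it from the Spark source root
spark_build_cmd
```

Keeping the profile and the version string in two variables makes it harder to change one without noticing the other, which matters because the profile selects the Hadoop line while the version string selects the exact CDH artifacts.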
RE: Newest ML-Lib on Spark 1.1
Hi Sean, I should clarify: I was able to build master, but when running I hit really random-looking protobuf errors (just starting up a Spark shell). I can try doing a build later today and give the exact stack trace.

I know that 5.2 is running 1.1, but I believe the latest and greatest MLlib is much fresher than the one in 1.1 and specifically includes fixes for ALS to help it scale better.

I had built with the exact flags you suggested below. After doing so I tried to run the test suite and run a Spark shell, without success. Might you have any other suggestions?

Thanks!

-----Original Message-----
From: Sean Owen [so...@cloudera.com]
Sent: Friday, December 12, 2014 04:54 PM Eastern Standard Time
To: Ganelin, Ilya
Cc: dev
Subject: Re: Newest ML-Lib on Spark 1.1

The safest build flags should be -Phadoop-2.4 -Dhadoop.version=2.5.0-cdh5.2.1.
Re: Newest ML-Lib on Spark 1.1
What errors do you see? Protobuf errors usually mean you didn't build for the right version of Hadoop, but if you are using -Phadoop-2.3, or better -Phadoop-2.4, that should be fine. Yes, a stack trace would be good. I'm still not sure what error you are seeing.

On Fri, Dec 12, 2014 at 10:32 PM, Ganelin, Ilya ilya.gane...@capitalone.com wrote:
I was able to build the master but when running I hit really random looking protobuf errors (just starting up a spark shell), I can try doing a build later today and give the exact stack trace.
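Since protobuf errors are attributed here to building against the wrong Hadoop version, one quick sanity check is to compare the Hadoop version embedded in the Spark assembly jar name with the version the cluster reports. The sketch below assumes the assembly jar follows the spark-assembly-<spark version>-hadoop<hadoop version>.jar naming pattern; the jar name in the example is hypothetical, and jar_hadoop_version and versions_match are illustrative helper names.

```shell
# Sketch only: compares the Hadoop version embedded in a Spark assembly jar
# name with a cluster version string. Function names are illustrative.

# Extract "2.5.0-cdh5.2.1" from "spark-assembly-1.2.0-hadoop2.5.0-cdh5.2.1.jar"
jar_hadoop_version() {
  basename "$1" .jar | sed 's/^.*-hadoop//'
}

# Succeeds only when the two version strings are identical
versions_match() {
  [ "$(jar_hadoop_version "$1")" = "$2" ]
}

# Example with a hypothetical jar name and a cluster version typed in by hand
if versions_match "spark-assembly-1.2.0-hadoop2.5.0-cdh5.2.1.jar" "2.5.0-cdh5.2.1"; then
  echo "match"
else
  echo "mismatch"
fi
```

On a real cluster the second argument could be taken from the first line of `hadoop version`, after removing its leading "Hadoop " label. A mismatch does not prove the cause of the errors, but it is a cheap first thing to rule out.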
Re: Newest ML-Lib on Spark 1.1
The protobuf errors come from a missing -Phadoop-2.3 profile.

On Fri, Dec 12, 2014 at 2:34 PM, Sean Owen so...@cloudera.com wrote:
protobuf errors usually mean you didn't build for the right version of Hadoop, but if you are using -Phadoop-2.3 or better -Phadoop-2.4 that should be fine.