Newest ML-Lib on Spark 1.1

2014-12-12 Thread Ganelin, Ilya
Hi all, we’re running CDH 5.2 and would be interested in having the latest and
greatest MLlib version on our cluster (with YARN). Could anyone help me figure
out which build profiles to use to get this to play well? Will I be able to
update MLlib independently of updating the rest of Spark to 1.2 and beyond? I
ran into numerous issues trying to build 1.2 against CDH’s Hadoop deployment.
Alternatively, if anyone has managed to get trunk successfully built and tested
against Cloudera’s YARN and Hadoop for 5.2, I would love some help. Thanks!


The information contained in this e-mail is confidential and/or proprietary to 
Capital One and/or its affiliates. The information transmitted herewith is 
intended only for use by the individual or entity to which it is addressed.  If 
the reader of this message is not the intended recipient, you are hereby 
notified that any review, retransmission, dissemination, distribution, copying 
or other use of, or taking of any action in reliance upon this information is 
strictly prohibited. If you have received this communication in error, please 
contact the sender and delete the material from your computer.


Re: Newest ML-Lib on Spark 1.1

2014-12-12 Thread Debasish Das
For CDH this works well for me (tested up to 5.1):

./make-distribution.sh -Dhadoop.version=2.3.0-cdh5.1.0 -Phadoop-2.3 -Pyarn \
    -Phive -DskipTests

To build with hive thriftserver support for spark-sql


Re: Newest ML-Lib on Spark 1.1

2014-12-12 Thread Sean Owen
Could you specify what problems you're seeing? There is nothing
special about the CDH distribution at all.

The latest and greatest is 1.1, and that is what is in CDH 5.2. You
can certainly compile even master for CDH and get it to work though.

The safest build flags should be -Phadoop-2.4 -Dhadoop.version=2.5.0-cdh5.2.1.

CDH 5.3 is just around the corner and includes Spark 1.2, which is also
just around the corner.
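
As a sketch of how the flags suggested above could fit into a full Maven
invocation: the -Pyarn profile, the -DskipTests flag and the "clean package"
goals are assumptions borrowed from the other build command in this thread and
from the usual Spark build instructions, not something stated in this message.
The script only assembles and prints the command, so it can be reviewed before
being run from the Spark source root.

```shell
#!/bin/sh
# Dry run: assemble the Maven command for a CDH 5.2 build and print it.
# The profile and version are the ones suggested in this thread; -Pyarn,
# -DskipTests and "clean package" are assumptions, adjust as needed.
HADOOP_PROFILE="hadoop-2.4"
HADOOP_VERSION="2.5.0-cdh5.2.1"

BUILD_CMD="mvn -Pyarn -P${HADOOP_PROFILE} -Dhadoop.version=${HADOOP_VERSION} -DskipTests clean package"

# Print instead of executing, so the flags can be checked first.
echo "$BUILD_CMD"
```

Changing HADOOP_VERSION to match whatever the cluster actually runs is the
only edit that should be needed for a different CDH 5.x point release.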


-
To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
For additional commands, e-mail: dev-h...@spark.apache.org



RE: Newest ML-Lib on Spark 1.1

2014-12-12 Thread Ganelin, Ilya
Hi Sean - I should clarify: I was able to build master, but when running it I
hit really random-looking protobuf errors (just starting up a Spark shell). I
can try doing a build later today and give the exact stack trace.

I know that 5.2 is running 1.1, but I believe the latest and greatest MLlib is
much fresher than the one in 1.1 and specifically includes fixes for ALS to
help it scale better.

I had built with the exact flags you suggested. After doing so I tried to
run the test suite and run a Spark shell, without success. Might you have any
other suggestions? Thanks!





Re: Newest ML-Lib on Spark 1.1

2014-12-12 Thread Sean Owen
What errors do you see? Protobuf errors usually mean you didn't build
for the right version of Hadoop, but if you are using -Phadoop-2.3 or,
better, -Phadoop-2.4, that should be fine. Yes, a stack trace would be
good. I'm still not sure what error you are seeing.
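
One way to guard against that kind of mismatch is to derive the profile from
the Hadoop version the cluster reports. A minimal sketch, assuming the mapping
used in this thread (2.3.x builds take -Phadoop-2.3, 2.4.x and later 2.x
builds take -Phadoop-2.4). The function name profile_for_hadoop is
hypothetical, and the version string would come from something like the first
line of `hadoop version` on a cluster node.

```shell
#!/bin/sh
# Hypothetical helper: map a Hadoop version string to a Spark build profile.
# The mapping is an assumption based on the profiles named in this thread.
profile_for_hadoop() {
  case "$1" in
    2.3.*)     echo "hadoop-2.3" ;;
    2.[4-9].*) echo "hadoop-2.4" ;;
    *)         echo "unknown" ;;
  esac
}

# Example lookups for the two CDH versions mentioned in this thread.
profile_for_hadoop "2.3.0-cdh5.1.0"
profile_for_hadoop "2.5.0-cdh5.2.1"
```

If the helper prints "unknown", the safest course is to check the build
documentation for the Spark version being compiled rather than guess a profile.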




Re: Newest ML-Lib on Spark 1.1

2014-12-12 Thread Debasish Das
The protobuf errors come from a missing -Phadoop-2.3 profile.
