Re: [openstack-dev] [sahara] About Sahara Oozie plan and Spark CDH Issues

2015-01-28 Thread Daniele Venzano
Hello everyone,

there is already some code in our repository:
https://github.com/bigfootproject/savanna-image-elements

I made the necessary changes to have the Spark element use the cdh5
element, and I also updated it to Spark 1.2. The old Cloudera HDFS-only
element is still needed for generating cdh4 images (but cdh4 support
can probably be dropped).

Unfortunately, I do not have the time to do the necessary
testing/validation and submit it for review. I also changed the CDH
element so that it can install only HDFS, if so required.
The changes I made are simple and are all contained in the last commit
on the master branch of that repo.

The image generated with this code runs in Sahara without any further
changes. Feel free to take the code, clean it up and submit for review.

Dan

On Wed, Jan 28, 2015 at 10:43:30AM -0500, Trevor McKay wrote:
 Intel folks,
 
 Belated welcome to Sahara!  Thank you for your recent commits.
 
 Moving this thread to openstack-dev so others may contribute, cc'ing
 Daniele and Pietro who pioneered the Spark plugin.
 
 I'll respond with another email about Oozie work, but I want to
 address the Spark/Swift issue in CDH since I have been working
 on it and there is a task which still needs to be done -- that
 is to upgrade the CDH version in the spark image and see if
 the situation improves (see below).
 
 Relevant reviews are here:
 
 https://review.openstack.org/146659
 https://review.openstack.org/147955
 https://review.openstack.org/147985
 
 In the first review, you can see that we set an extra driver
 classpath to pull in '/usr/lib/hadoop/lib/jackson-core-asl-1.8.8.jar'.
 
 This is because the spark-assembly JAR in CDH4 contains classes from
 jackson-mapper-asl-1.8.8 and jackson-core-asl-1.9.x. When the
 hadoop-swift.jar dereferences a Swift path, it calls into code
 from jackson-mapper-asl-1.8.8 which uses JsonClass.  But JsonClass
 was removed in jackson-core-asl-1.9.x, so there is an exception.
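
One way to check what a given assembly actually bundles is to list its class entries directly, since a JAR is an ordinary zip archive. The following is only a minimal sketch: the helper names `find_class_entries` and `has_class` are hypothetical, and the JAR path and class name to inspect are left to the caller.

```python
import zipfile


def find_class_entries(jar_path, package_prefix):
    """Return the sorted .class entries in the JAR under package_prefix.

    package_prefix uses slash-separated form, e.g. "org/codehaus/jackson/".
    """
    with zipfile.ZipFile(jar_path) as jar:
        return sorted(
            name
            for name in jar.namelist()
            if name.startswith(package_prefix) and name.endswith(".class")
        )


def has_class(jar_path, class_name):
    """Return True if the JAR contains an entry for the dotted class name."""
    entry = class_name.replace(".", "/") + ".class"
    with zipfile.ZipFile(jar_path) as jar:
        return entry in jar.namelist()
```

Running `has_class` against both the Spark assembly and the Hadoop Jackson JAR, with the fully qualified name of the class that triggers the exception, should show which of the two still provides it.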
 
 Therefore, we need to use the classpath to either upgrade the version of
 jackson-mapper-asl to 1.9.x or downgrade the version of jackson-core-asl
 to 1.8.8 (both work in my testing).  However, the first of these options
 requires us to bundle an extra jar.  Since /usr/lib/hadoop already
 contains jackson-core-asl-1.8.8, it is easier to just add that to the
 classpath and downgrade the jackson version.
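
As a rough illustration of the downgrade option, the sketch below builds a `spark-submit` command line that adds the Hadoop copy of jackson-core-asl to the driver classpath through the `--driver-class-path` option. The helper `build_spark_submit`, the application JAR, and the main class are hypothetical placeholders, and this is not necessarily how the reviews above pass the setting.

```python
import shlex

# Path quoted in the message above.
JACKSON_CORE_JAR = "/usr/lib/hadoop/lib/jackson-core-asl-1.8.8.jar"


def build_spark_submit(app_jar, main_class,
                       extra_driver_classpath=JACKSON_CORE_JAR):
    """Return a spark-submit argv list with an extra driver classpath entry."""
    return [
        "spark-submit",
        "--driver-class-path", extra_driver_classpath,
        "--class", main_class,
        app_jar,
    ]


if __name__ == "__main__":
    # Hypothetical application JAR and main class, for illustration only.
    cmd = build_spark_submit("app.jar", "org.example.Main")
    print(" ".join(shlex.quote(part) for part in cmd))
```

The same entry can also be supplied through the `spark.driver.extraClassPath` configuration property instead of the command-line option.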
 
 Note that there are some references to this problem on the Spark mailing
 list; we are not the only ones to encounter it.
 
 However, I am not completely comfortable with mixing versions and
 patching the classpath this way.  It looks to me like the Spark assembly
 used in CDH5 has consistent versions, and I would like to try updating
 the CDH version in sahara-image-elements to CDH5 for Spark. If this fixes
 the problem and removes the need for the extra classpath, that would be
 great.
 
 Would someone like to take on this change (modifying sahara-image-elements
 to use CDH5 for Spark images)? I can make a blueprint for it.
 
 More to come about Oozie topics.
 
 Best regards,
 
 Trevor
 
 On Thu, 2015-01-15 at 15:34 +, Chen, Weiting wrote:
  Hi McKay,
  
  We are a team at Intel contributing to the OpenStack Sahara project.
  We are new to Sahara and would like to contribute more to the project.
  So far, we have been focusing on the Sahara CDH plugin, so if there are
  any issues related to it, please feel free to discuss them with us.
  
  During the IRC meeting, you mentioned two issues that we would like to
  discuss with you.
  
  1.  Oozie Workflow Support:
  
  Do you have a plan that you could share with us? In our case, we are
  testing a java action job with HBase library support and are also
  facing some problems with Oozie support, so it would be good to share
  our experience with each other.
  
  2.  Spark CDH Issues:
  
  Could you provide more information about this issue? In the CDH plugin,
  we have used CDH 5 to complete the Swift test, so it should be fine to
  upgrade from CDH 4 to 5.
  

-- 
Daniele Venzano
http://www.brownhat.org


__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [sahara] About Sahara Oozie plan and Spark CDH Issues

2015-01-28 Thread Trevor McKay
Daniele,

  Excellent! I'll have to keep a closer eye on bigfoot activity :) I'll
pursue this.

Best,

Trevor
