Step (2) is straightforward. I downloaded the 3.1.0 source code and built it with the "-Dhadoop.profile=2" option. I could successfully run a few bulk-loading scenarios using the minimal client jar produced by this build.
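For reference, the build described above would look roughly like this. This is only a sketch: the source archive name, module layout, and client jar name are illustrative and may differ between Phoenix releases; the `-Dhadoop.profile=2` flag is the one mentioned above.

```shell
# Build Phoenix 3.1.0 against the Hadoop 2 profile
# (assumes Maven and a JDK are installed; archive/jar names are illustrative)
tar xzf phoenix-3.1.0-src.tar.gz
cd phoenix-3.1.0-src
mvn clean package -DskipTests -Dhadoop.profile=2

# The client jar used for bulk loading is typically produced by the
# assembly module, e.g.:
ls phoenix-assembly/target/phoenix-*-client*.jar
```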
On Mon, Sep 8, 2014 at 9:07 PM, James Taylor <jamestay...@apache.org> wrote:
> Thanks, Puneet. That's super helpful. Was (2) difficult to do? That might
> make an interesting blog if you're up for it. I'd be happy to post on your
> behalf if that's helpful.
>
> Thanks,
> James
>
> On Monday, September 8, 2014, Puneet Kumar Ojha <puneet.ku...@pubmatic.com>
> wrote:
>
>> See comments inline.
>>
>> Thanks
>>
>> ------ Original message ------
>> *From:* Krishna
>> *Date:* Tue, Sep 9, 2014 5:24 AM
>> *To:* user@phoenix.apache.org
>> *Subject:* Phoenix on Amazon EMR
>>
>> Hi,
>>
>> Does anyone have experience using Amazon EMR with Phoenix? I'm
>> currently evaluating Phoenix as an HBase store on Amazon EMR. EMR provides
>> Phoenix 2.1.2 as the default installation, but I'd prefer to use 3.x. --- Use 3.x.
>>
>> Could someone clarify the following with regard to 2.1.2?
>>
>> 1. Does this version support bulk loading? We expect to
>> load more than a trillion rows, so a bulk loader is a necessity. Can
>> Phoenix 2.1.2 run on either Hadoop1 or Hadoop2? --- No. Use 3.x for the
>> MapReduce bulk upload.
>> 2. Did anyone try installing Phoenix 3.x using EMR's bootstrap action
>> capabilities? --- Yes, it works. You will need to build the client jar against
>> the Hadoop 2 version supported by AWS.
>> 3. In the following arguments to the bulk loader, is the port number
>> required or optional? If I'm using Hadoop2, should the Resource Manager node
>> be substituted for the Job Tracker? --- Yes. You will see the port details
>> when you log in to the EMR cluster.
>>    1. -hd <arg> HDFS NameNode IP:<port>
>>    2. -mr <arg> MapReduce Job Tracker IP:<port>
>>    3. -zk <arg> Zookeeper IP:<port>
>>
>> Thanks for your inputs.
>>
>> Krishna
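Putting the flags from question (3) together, a bulk loader run from the EMR master node would look roughly like this. This is a sketch only: the script name, input path, and table name are hypothetical, and the endpoints must be replaced with the actual NameNode, Job Tracker (or Resource Manager, per the answer above), and ZooKeeper host:port values shown on the cluster; only the -hd/-mr/-zk flags themselves come from the thread.

```shell
# Hypothetical CSV bulk load invocation; script name and the -i/-t
# options are assumptions, -hd/-mr/-zk are the flags listed above.
# Replace each host:port with the values from your EMR cluster.
./bin/csv-bulk-loader.sh \
  -i /data/input.csv \
  -t MY_TABLE \
  -hd hdfs://namenode-host:9000 \
  -mr jobtracker-host:9001 \
  -zk zookeeper-host:2181
```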