Re: Doubts related to Apache Blur

Naresh Yadav Wed, 27 Nov 2013 04:28:45 -0800

Thanks Garrett, that worked. I am now able to build Blur with Hadoop 2.0, and
I am working on the configuration part.


NARESH

On Tue, Nov 26, 2013 at 9:59 PM, Garrett Barton <[email protected]> wrote:

> Yes, you do need to compile Blur against Hadoop 2.x. To do that, switch to
> the blur 0.2 branch and run this Maven command from the root blur dir:
>
> mvn clean package install -P\!hadoop-1x,cdh4-mr1 -Dhadoop.version=2.0.0-mr1-cdh4.3.0 -DskipTests=true
>
> Should compile fine.
>
> The other changes are in the shell scripts, which you would have to recreate
> in the bat files.  Since Hadoop 2.x is split into multiple dirs (hadoop, hdfs,
> mr, and conf), I basically supplemented the existing required HADOOP_HOME env
> var with 3 additional optional ones: HDFS_HOME, MAPRED_HOME, and
> HADOOP_CONF.  I also manually copied ant-1.6.5.jar into Blur's lib folder.
> That is all that is required to make the thing work.  See this JIRA:
> https://issues.apache.org/jira/browse/BLUR-313
>
> Good luck with windows,
> ~Garrett
>
>
> On Tue, Nov 26, 2013 at 9:02 AM, Naresh Yadav <[email protected]>
> wrote:
>
> > *Garrett,*
> > I was able to start the Blur servers with hadoop1.2.1, but I am having
> > trouble building with Maven against the hadoop2.2.0 dependency.
> >
> > Please help me with the Blur and Hadoop 2.0 problems. My Hadoop 2.0 is up
> > and running.
> > Here is what I did:
> >
> > git clone https://git-wip-us.apache.org/repos/asf/incubator-blur.git
> >
> > Then in pom.xml I changed <hadoop.version>1.2.1</hadoop.version> to
> > <hadoop.version>2.2.0</hadoop.version>
> >
> > Then I ran
> >
> > mvn install -DskipTests -P distribution
> >
> > It fails with this error:
> >
> > [ERROR] Failed to execute goal on project blur-util: Could not resolve
> > dependencies for project
> > org.apache.blur:blur-util:jar:0.3.0-incubating-SNAPSHOT: Could not find
> > artifact org.apache.hadoop:hadoop-core:jar:2.2.0 in libdir
> > (file://D:\blursrc\incubator-blur-hadoop2..2.0\blur-util/../lib) -> [Help
> > 1]
> > [ERROR]
> > [ERROR] To see the full stack trace of the errors, re-run Maven with the -e
> > switch.
> > [ERROR] Re-run Maven using the -X switch to enable full debug logging.
> > [ERROR]
> > [ERROR] For more information about the errors and possible solutions,
> > please read the following articles:
> > [ERROR] [Help 1]
> > http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException
> > [ERROR]
> > [ERROR] After correcting the problems, you can resume the build with the
> > command
> > [ERROR]   mvn <goals> -rf :blur-util
> >
> > Please send me all the changes required for this to succeed. I am assuming
> > that to use Hadoop 2.0 I would also need to compile the Blur code against
> > the hadoop2.2.0 jars.
> >
> > NARESH
> >
> > On Thu, Nov 21, 2013 at 8:42 PM, Garrett Barton <[email protected]> wrote:
> >
> > > Welcome aboard!
> > >
> > > I can answer a few:
> > >
> > > 1. Yes, with some build flags and script tweaking, which I can help with.
> > > I am running it now.
> > >
> > > 2. You will have to make startup scripts for Windows, and honestly I could
> > > not tell you if Blur would even run in a Windows environment.  Have you
> > > considered doing dev in a VM? Or running a VM on your Windows machine, at
> > > least for hosting the Hadoop stack?
> > >
> > > 3. Are you familiar with Lucene itself?  You must query against a column
> > > (ok, not 100% true with Blur, but it seems like you have specified
> > > field1=x field2=y requirements). I am slightly confused by your queries,
> > > as they have a mix of column names and values that are in different
> > > columns in your example.
> > > Assuming your first query is cost:50 AND period:Nov13 AND pool1:Tag1, then
> > > sure. If you meant any kind of cost, then you simply omit that from the
> > > query in the first place.
> > > Assuming your second query is (cost:50 OR cost:150) AND period:Dec13 AND
> > > pool1:Tag1 AND pool2:Tag2, then sure, that works too.
> > >
> > > For the most part, if you can write a pretty standard SQL statement to
> > > query for your data as if it was in a database, that can be duplicated
> > > inside Blur.
> > >
> > >
> > > Millions of rows will be fine.  A single table with the column names you
> > > have described is fine; you will have to come up with some kind of unique
> > > identifier for each row to load into Blur (like a primary key in a
> > > database).
> > >
> > > Let me know if you have any more questions. :)
> > >
> > > ~Garrett
> > >
> > >
> > > On Thu, Nov 21, 2013 at 5:38 AM, Naresh Yadav <[email protected]>
> > > wrote:
> > >
> > > > hi,
> > > >
> > > > I have been reading about Apache Blur for the last day, and I found it
> > > > a perfect fit for my project. But I have some doubts:
> > > >
> > > > 1. Will I be able to use an existing Hadoop 2.0 cluster with the latest
> > > > version of Apache Blur?
> > > >
> > > > 2. My development environment is Windows, and Hadoop 2.0 supports
> > > > Windows, so will the latest version of Apache Blur work smoothly on
> > > > Windows? Will I get startup scripts for Windows?
> > > >
> > > > 3. Here are 4 rows of my data which I need to store in one table:
> > > >        Cost=50, Period=Nov13, Pool1=Tag1, Pool2=Tag2
> > > >        Cost=50, Period=Nov13, Pool1=Tag1, Pool2=Tag3
> > > >        Cost=150, Period=Dec13, Pool1=Tag1, Pool2=Tag2, Pool3=Tag3
> > > >        Cost=150, Period=Dec13, Pool1=Tag1, Pool2=Tag2, Pool3=Tag4
> > > >
> > > >    Query 1: get all rows with
> > > >              Cost, Nov13, Tag1
> > > >    Query 2: get all rows with Cost, Dec13, Tag1, Tag2
> > > >      Will I be able to perform such queries? If yes, how should I design
> > > > this Blur table for this use case? Note: this table can hold millions
> > > > of rows with all the historic data.
> > > >
> > > > Please help me, I am new to big data technologies. Your guidance will
> > > > give me direction to proceed.
> > > >
> > > > Thanks
> > > > Naresh
> > > >
> > >
> >
>
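
For anyone following the query discussion in the quoted thread: the sketch
below shows how the two Lucene-style queries Garrett suggested would select
from the four sample rows. It is plain Python over an in-memory list and does
not call Blur or Lucene at all. The rowid values (r1..r4) and the helper names
term, all_of, any_of and search are made up for illustration; the rowid stands
in for the unique per-row identifier mentioned in the thread.

```python
# Hypothetical in-memory stand-in for the Blur table; values kept as strings.
ROWS = [
    {"rowid": "r1", "cost": "50", "period": "Nov13", "pool1": "Tag1", "pool2": "Tag2"},
    {"rowid": "r2", "cost": "50", "period": "Nov13", "pool1": "Tag1", "pool2": "Tag3"},
    {"rowid": "r3", "cost": "150", "period": "Dec13", "pool1": "Tag1", "pool2": "Tag2", "pool3": "Tag3"},
    {"rowid": "r4", "cost": "150", "period": "Dec13", "pool1": "Tag1", "pool2": "Tag2", "pool3": "Tag4"},
]


def term(column, value):
    """Match rows whose column equals value, like a column:value term."""
    return lambda row: row.get(column) == value


def all_of(*clauses):
    """AND: every clause must match."""
    return lambda row: all(clause(row) for clause in clauses)


def any_of(*clauses):
    """OR: at least one clause must match."""
    return lambda row: any(clause(row) for clause in clauses)


def search(query, rows=ROWS):
    """Return the rowids of all rows that satisfy the query."""
    return [row["rowid"] for row in rows if query(row)]


# cost:50 AND period:Nov13 AND pool1:Tag1
query1 = all_of(term("cost", "50"), term("period", "Nov13"), term("pool1", "Tag1"))

# (cost:50 OR cost:150) AND period:Dec13 AND pool1:Tag1 AND pool2:Tag2
query2 = all_of(
    any_of(term("cost", "50"), term("cost", "150")),
    term("period", "Dec13"),
    term("pool1", "Tag1"),
    term("pool2", "Tag2"),
)

# "Any kind of cost": leave the cost term out of the query entirely.
query1_any_cost = all_of(term("period", "Nov13"), term("pool1", "Tag1"))

print(search(query1))           # ['r1', 'r2']
print(search(query2))           # ['r3', 'r4']
print(search(query1_any_cost))  # ['r1', 'r2']
```

In the sample data every Nov13 row has cost 50, so dropping the cost term does
not change the result for Query 1; with more varied data it would.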
