Thanks Garrett, that worked. I am now able to build Blur with Hadoop 2.0, and I am working on the configuration part.
NARESH

On Tue, Nov 26, 2013 at 9:59 PM, Garrett Barton <[email protected]> wrote:

> Yea you do need to compile blur with hadoop 2.x. To do that switch to the
> blur 0.2 branch and fire this maven command from the root blur dir:
>
> mvn clean package install -P\!hadoop-1x,cdh4-mr1 -Dhadoop.version=2.0.0-mr1-cdh4.3.0 -DskipTests=true
>
> Should compile fine.
>
> The other changes are in shell scripts which you would have to recreate in
> the bat files. Since hadoop 2.x is split into multiple dirs (hadoop, hdfs,
> mr, and conf), I basically supplemented the existing required hadoop_home env
> var with 3 additional optional ones: HDFS_HOME, MAPRED_HOME, and
> HADOOP_CONF. I also manually copied ant-1.6.5.jar into blur's lib folder.
> That is all that is required to make the thing work. See this JIRA:
> https://issues.apache.org/jira/browse/BLUR-313
>
> Good luck with windows,
> ~Garrett
>
> On Tue, Nov 26, 2013 at 9:02 AM, Naresh Yadav <[email protected]> wrote:
>
> > *Garrett,*
> > I was able to start the blur servers with hadoop1.2.1, but I am facing a
> > problem building with maven against the hadoop2.2.0 dependency.
> >
> > Please help me with the blur and hadoop 2.0 problems. My hadoop 2.0 is up
> > and running.
> > Here is what I did:
> >
> > git clone https://git-wip-us.apache.org/repos/asf/incubator-blur.git
> >
> > Then in pom.xml I changed <hadoop.version>1.2.1</hadoop.version> to
> > <hadoop.version>2.2.0</hadoop.version>
> >
> > Then I ran
> >
> > mvn install -DskipTests -P distribution
> >
> > It gives this error:
> >
> > [ERROR] Failed to execute goal on project blur-util: Could not resolve dependencies for project org.apache.blur:blur-util:jar:0.3.0-incubating-SNAPSHOT: Could not find artifact org.apache.hadoop:hadoop-core:jar:2.2.0 in libdir (file://D:\blursrc\incubator-blur-hadoop2..2.0\blur-util/../lib) -> [Help 1]
> > [ERROR]
> > [ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
> > [ERROR] Re-run Maven using the -X switch to enable full debug logging.
> > [ERROR]
> > [ERROR] For more information about the errors and possible solutions, please read the following articles:
> > [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException
> > [ERROR]
> > [ERROR] After correcting the problems, you can resume the build with the command
> > [ERROR]   mvn <goals> -rf :blur-util
> >
> > Please send me all the changes required for this to succeed. I am assuming
> > that to use hadoop2.0 I would also need to compile the blur code
> > with the hadoop2.2.0 jars.
> >
> > NARESH
> >
> > On Thu, Nov 21, 2013 at 8:42 PM, Garrett Barton <[email protected]> wrote:
> >
> > > Welcome aboard!
> > >
> > > I can answer a few:
> > >
> > > 1. Yes, with some build flags and script tweaking that I can help with. I am
> > > running it now.
> > >
> > > 2. You will have to make startup scripts for windows, and honestly I could
> > > not tell you if Blur would even run in a windows environment. Have you
> > > considered doing dev in a VM? Or running a VM on your windows machine, at
> > > least for hosting the hadoop stack?
> > >
> > > 3. Are you familiar with lucene itself? You must query against a column
> > > (ok, not 100% true with blur, but it seems like you have specified field1=x
> > > field2=y requirements). I am slightly confused by your queries, as they
> > > have a mix of column names and values that are in different columns in your
> > > example.
> > > Assuming your first query is cost:50 AND period:Nov13 AND pool1:Tag1 then
> > > sure. If you meant any kind of cost, then you simply omit that from the
> > > query in the first place.
> > > Assuming your second query is (cost:50 OR cost:150) AND period:Dec13 AND
> > > pool1:Tag1 AND pool2:Tag2 then sure, that works too.
> > >
> > > For the most part, if you can write a pretty standard SQL statement to
> > > query for your data as if it was in a database, that can be duplicated
> > > inside Blur.
> > >
> > > Millions of rows will be fine. A single table with the column names you
> > > have described is fine, but you will have to come up with some kind of unique
> > > identifier for each row to load into Blur (like a primary key in a
> > > database).
> > >
> > > Let me know if you have any more questions. :)
> > >
> > > ~Garrett
> > >
> > > On Thu, Nov 21, 2013 at 5:38 AM, Naresh Yadav <[email protected]> wrote:
> > >
> > > > hi,
> > > >
> > > > I have been reading about Apache Blur for the last day and I found it a
> > > > perfect fit for my project. But I have some doubts:
> > > >
> > > > 1. Will I be able to use an existing Hadoop 2.0 cluster with the latest
> > > >    Apache Blur version?
> > > >
> > > > 2. My development environment is Windows, and Hadoop 2.0 supports Windows,
> > > >    so I have a doubt whether the latest Apache Blur version will work on
> > > >    Windows smoothly. Will I get startup scripts for Windows?
> > > >
> > > > 3. Here are 4 rows of my data which I need to store in one table:
> > > >    Cost=50, Period=Nov13, Pool1=Tag1, Pool2=Tag2
> > > >    Cost=50, Period=Nov13, Pool1=Tag1, Pool2=Tag3
> > > >    Cost=150, Period=Dec13, Pool1=Tag1, Pool2=Tag2, Pool3=Tag3
> > > >    Cost=150, Period=Dec13, Pool1=Tag1, Pool2=Tag2, Pool3=Tag4
> > > >
> > > >    Query 1: get all rows with Cost, Nov13, Tag1
> > > >    Query 2: get all rows with Cost, Dec13, Tag1, Tag2
> > > >
> > > >    Will I be able to perform such queries? If yes, how should I design
> > > >    this Blur table for this use case? Note: this table can hold
> > > >    millions of rows with all historic data.
> > > >
> > > > Please help me, I am new to big data technologies. Your guidance will
> > > > give me direction to proceed.
> > > >
> > > > Thanks
> > > > Naresh
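
As a rough illustration of the column:value queries Garrett describes in the thread, the Python sketch below builds Lucene-style query strings from a mapping of columns to accepted values. The helper names (`term`, `any_of`, `all_of`) are hypothetical and are not part of Blur or Lucene; only the resulting query text mirrors the examples in the thread. Whether a given Blur version needs extra qualification on the column names is not covered here.

```python
def term(column, value):
    """Render one column:value clause, e.g. cost:50."""
    return f"{column}:{value}"


def any_of(column, values):
    """OR together the accepted values for a single column."""
    clauses = [term(column, v) for v in values]
    if len(clauses) == 1:
        return clauses[0]
    return "(" + " OR ".join(clauses) + ")"


def all_of(requirements):
    """AND together the per-column requirements.

    `requirements` maps a column name to a list of accepted values.
    A column that is left out of the mapping is not constrained at all.
    """
    return " AND ".join(any_of(col, vals) for col, vals in requirements.items())


# The two queries as Garrett interpreted them in the thread.
query1 = all_of({"cost": [50], "period": ["Nov13"], "pool1": ["Tag1"]})
query2 = all_of({
    "cost": [50, 150],
    "period": ["Dec13"],
    "pool1": ["Tag1"],
    "pool2": ["Tag2"],
})

print(query1)
print(query2)
```

Leaving `cost` out of the mapping gives the "any kind of cost" variant Garrett mentions, since an omitted column places no restriction on the rows returned.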
