Hi, I was able to get blur started (shards and controllers). It worked straight away. Awesome. I have a few more questions. My apologies if some of the questions are naive.
1. I am not able to find the 'blur.*.hostname' properties in the blur.properties file, but they are listed in the README file.
2. There seems to be a lot of code. I would greatly appreciate it if someone could give me pointers before I dig through the codebase, something like an architectural overview or a flow explaining how a search query is resolved.
3. How do you manage your development workspace with Eclipse, Git, and Maven? This would definitely help me get a kickstart.
4. I started Hadoop (HDFS + MapReduce), ZooKeeper, and Blur. What are the steps in actually using it? Where do we start?

Below I am outlining the steps that I followed to get Blur running, along with a couple of errors that I got during the build process. The overall build was successful, though.

Apache Blur Single Node Setup on Mac OS X Lion

1. Environment: Single Node Hadoop-1.0.4 and Zookeeper-3.4.5
2. Get the Blur code from Git using
   git clone https://git-wip-us.apache.org/repos/asf/incubator-blur.git
3. Check out the branch 0.1.5
4. Run 'mvn clean install' from the 'src' directory as superuser
5. Extract the Blur tar.gz file from the 'target/' directory into a convenient location, set BLUR_HOME to this location, and add it to .bash_profile
6. Go to the extracted folder and configure the $BLUR_HOME/config/blur-env.sh file. The two exports that are required:
   export JAVA_HOME=$(/usr/libexec/java_home)
   export HADOOP_HOME=/usr/local/hadoop
7. Set up the $BLUR_HOME/config/blur.properties file. The default site configuration:
   blur.zookeeper.connection=localhost
   blur.cluster.name=default
8. Start Blur using $BLUR_HOME/bin/start-all.sh

Errors during the build process:

ERROR 20130430_22:47:42:042_PDT [IndexReader-Refresher] writer.BlurIndexRefresher: Unknown error
org.apache.lucene.store.AlreadyClosedException: this Directory is closed
    at org.apache.lucene.store.Directory.ensureOpen(Directory.java:256)
    at org.apache.lucene.store.FSDirectory.listAll(FSDirectory.java:240)
    at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:679)
    at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:630)
    at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:343)
    at org.apache.lucene.index.StandardDirectoryReader.isCurrent(StandardDirectoryReader.java:326)
    at org.apache.lucene.index.StandardDirectoryReader.doOpenNoWriter(StandardDirectoryReader.java:284)
    at org.apache.lucene.index.StandardDirectoryReader.doOpenIfChanged(StandardDirectoryReader.java:247)
    at org.apache.lucene.index.StandardDirectoryReader.doOpenIfChanged(StandardDirectoryReader.java:235)
    at org.apache.lucene.index.DirectoryReader.openIfChanged(DirectoryReader.java:169)
    at org.apache.blur.manager.writer.BlurIndexReader.refresh(BlurIndexReader.java:82)
    at org.apache.blur.manager.writer.BlurIndexRefresher.refreshInternal(BlurIndexRefresher.java:70)
    at org.apache.blur.manager.writer.BlurIndexRefresher.run(BlurIndexRefresher.java:61)
    at java.util.TimerThread.mainLoop(Timer.java:512)
    at java.util.TimerThread.run(Timer.java:462)

WARN 20130501_20:54:18:018_PDT [main] jmx.MBeanRegistry: Error during unregister
javax.management.InstanceNotFoundException: org.apache.ZooKeeperService:name0=StandaloneServer_port-1,name1=InMemoryDataTree
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getMBean(DefaultMBeanServerInterceptor.java:1094)
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.exclusiveUnregisterMBean(DefaultMBeanServerInterceptor.java:415)
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.unregisterMBean(DefaultMBeanServerInterceptor.java:403)
    at com.sun.jmx.mbeanserver.JmxMBeanServer.unregisterMBean(JmxMBeanServer.java:507)
    at org.apache.zookeeper.jmx.MBeanRegistry.unregister(MBeanRegistry.java:115)
    at org.apache.zookeeper.jmx.MBeanRegistry.unregister(MBeanRegistry.java:132)
    at org.apache.zookeeper.server.ZooKeeperServer.unregisterJMX(ZooKeeperServer.java:443)
    at org.apache.zookeeper.server.ZooKeeperServer.shutdown(ZooKeeperServer.java:436)
    at org.apache.zookeeper.server.NIOServerCnxnFactory.shutdown(NIOServerCnxnFactory.java:271)
    at org.apache.zookeeper.server.ZooKeeperServerMain.shutdown(ZooKeeperServerMain.java:127)
    at org.apache.blur.MiniCluster$ZooKeeperServerMainEmbedded.shutdown(MiniCluster.java:339)
    at org.apache.blur.MiniCluster.shutdownZooKeeper(MiniCluster.java:427)
    at org.apache.blur.MiniCluster.shutdownBlurCluster(MiniCluster.java:146)
    at org.apache.blur.thrift.BlurClusterTest.shutdownCluster(BlurClusterTest.java:81)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
    at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:37)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:236)
    at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:236)
    at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:134)
    at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:113)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(ReflectionUtils.java:189)
    at org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:165)
    at org.apache.maven.surefire.booter.ProviderFactory.invokeProvider(ProviderFactory.java:85)
    at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:103)
    at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:74)

- Rahul

On Tue, Apr 30, 2013 at 9:29 PM, rahul challapalli <[email protected]> wrote:

> Aaron,
>
> Thanks for your reply. I will sure let you know how it goes.
>
> - Rahul
>
>
> On Tue, Apr 30, 2013 at 7:33 PM, Aaron McCurry <[email protected]> wrote:
>
>> Hi Rahul,
>>
>> Welcome! Blur is a young incubator project and with that there is not a
>> lot of documentation. Yet. But we do have a lot of code. :-)
>>
>> Blur uses HDFS for storing indexes, MapReduce for bulk indexing, Thrift
>> for RPC and ZooKeeper for state, and of course Lucene for search. Yes
>> Blur can and should run along side a standard Hadoop install (MapReduce
>> + HDFS). It currently works with the 1.0.x version or CDH3 from
>> Cloudera. I'm sure we can get it to work with 2.0.x and CDH4, it just
>> hasn't happen yet. However the only dependency to run Blur on a single
>> machine is ZooKeeper. HDFS is required for a cluster.
>>
>> To get you started.
>>
>> git clone https://git-wip-us.apache.org/repos/asf/incubator-blur.git
>>
>> # we are currently focusing on getting 0.1.5 to a releasable state.
>> git checkout 0.1.5
>>
>> In the checkout you will find a README.md that is a bit out of date with
>> the code examples but the general theme is correct.
>> For more examples take a look at the blur-testsuite project, there are
>> a lot of code examples in there to get you started.
>>
>> To build the project into a tarball that can be extracted and executed.
>>
>> run "mvn install" from the src/ directory. Once it has successfully
>> executed all the tests and built everything you will find a tar.gz file
>> in the target/ directory in the distribution project.
>>
>> Before you can run Blur, Apache ZooKeeper needs to be running. A default
>> install will work.
>>
>> After extracting the Blur tar.gz file you should be able to run the
>> bin/start-all.sh and it should start a Blur controller and a shard
>> server on your local machine.
>>
>> I would love to hear how your initial compile and install goes, because
>> we could use this thread and any information that is exchanged to create
>> a nice little wiki page for 0.1.5.
>>
>> Thank!
>>
>> Aaron
>>
>>
>> On Tue, Apr 30, 2013 at 2:17 PM, rahul challapalli <[email protected]> wrote:
>>
>> > Hi,
>> >
>> > I am new to blur and even ASF in terms of contributing back to a
>> > project. I have decent knowledge about hadoop and mapreduce but
>> > completely new to search. I come from a Java/PHP background. I am
>> > looking for some direction in setting up blur on my local machine. I
>> > have a single node hadoop installation on my Mac OS X Lion. Is it an
>> > issue if I have HDFS, MapReduce daemons running alongside blur on the
>> > same machine. I would greatly appreciate if you can refer me to some
>> > setup document as well as an insight into the architecture of blur.
>> > Thank You.
>> >
>> > - Rahul
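
For anyone turning this thread into a wiki page, the configuration from steps 6 and 7 of the setup notes in this message could be scripted roughly as below. This is only a sketch: the function names write_blur_config and append_once are made up for illustration and are not part of Blur. It also assumes that appending a setting to the shipped blur-env.sh and blur.properties is acceptable, and that a later definition takes precedence over an earlier one. The JAVA_HOME line is the Mac OS X form from step 6.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of Blur.

# Append a line to a file unless that exact line is already present.
append_once() {
  file=$1
  line=$2
  grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Add the two exports (step 6) and the two site properties (step 7)
# under <blur_home>/config. Existing lines in those files are kept.
write_blur_config() {
  blur_home=$1
  mkdir -p "$blur_home/config"
  # Single quotes keep $(...) literal so it is evaluated when Blur sources the file.
  append_once "$blur_home/config/blur-env.sh" 'export JAVA_HOME=$(/usr/libexec/java_home)'
  append_once "$blur_home/config/blur-env.sh" 'export HADOOP_HOME=/usr/local/hadoop'
  append_once "$blur_home/config/blur.properties" 'blur.zookeeper.connection=localhost'
  append_once "$blur_home/config/blur.properties" 'blur.cluster.name=default'
}
```

After extracting the tarball and setting BLUR_HOME, this would be called as write_blur_config "$BLUR_HOME" before running $BLUR_HOME/bin/start-all.sh. Running it twice does not duplicate lines.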
