Turns out this was the issue: https://issues.apache.org/jira/browse/HDFS-2556
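For anyone hitting the same thing, here is a minimal sketch of why the umask matters, assuming (my reading of the linked issue) that the DataNode expects rwxr-xr-x (755) on its data directories and rejects anything else. The class and method names are made up for illustration and are not Hadoop APIs; the sketch only does the permission arithmetic and does not touch the filesystem.

```java
public class UmaskCheck {
    // Mode a directory ends up with when created with 0777 under the given umask.
    static int dirMode(int umask) {
        return 0777 & ~umask;
    }

    // Render the low nine permission bits in ls-style form, e.g. rwxr-xr-x.
    static String toSymbolic(int mode) {
        char[] flags = "rwxrwxrwx".toCharArray();
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 9; i++) {
            boolean set = (mode & (1 << (8 - i))) != 0;
            sb.append(set ? flags[i] : '-');
        }
        return sb.toString();
    }

    // True if a directory created under this umask has the expected permissions.
    static boolean matchesExpected(int umask, String expected) {
        return toSymbolic(dirMode(umask)).equals(expected);
    }

    public static void main(String[] args) {
        String expected = "rwxr-xr-x"; // assumption: what the DataNode wants
        for (int umask : new int[] {0002, 0022}) {
            System.out.println(String.format("umask %03o -> %s, matches expected: %b",
                    umask, toSymbolic(dirMode(umask)), matchesExpected(umask, expected)));
        }
    }
}
```

With a umask of 002 the directories come out group-writable (rwxrwxr-x), which does not match; with 022 they come out as rwxr-xr-x.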
Once I set the umask to 022 and reran the tests, they all passed. Note,
however, that ZookeeperClusterStatusTest seems flaky; it fails
occasionally for me.

Patrick

On Tue, Oct 23, 2012 at 4:02 PM, Patrick Hunt <[email protected]> wrote:
> Pushed a small cleanup to move all test file output into respective
> target directories and use absolute paths for test file locations.
>
> I thought this might fix BlurClusterTest; however, that's not the case:
>
> Starting DataNode 0 with dfs.data.dir:
> /home/phunt/dev/blur/src/blur-core/target/tmp/cluster/dfs/data/data1,/home/phunt/dev/blur/src/blur-core/target/tmp/cluster/dfs/data/data2
> ERROR 20121023_15:58:10:010_PDT [main] datanode.DataNode: All directories in dfs.data.dir are invalid.
> ERROR 20121023_15:58:10:010_PDT [main] datanode.DataNode: All directories in dfs.data.dir are invalid.
> ERROR 20121023_15:58:10:010_PDT [main] blur.MiniCluster: error opening file system
> java.lang.NullPointerException
>     at org.apache.hadoop.hdfs.MiniDFSCluster.startDataNodes(MiniDFSCluster.java:422)
>     at org.apache.hadoop.hdfs.MiniDFSCluster.<init>(MiniDFSCluster.java:280)
>     at org.apache.hadoop.hdfs.MiniDFSCluster.<init>(MiniDFSCluster.java:124)
>
> Patrick
>
> On Tue, Oct 23, 2012 at 2:43 PM, Patrick Hunt <[email protected]> wrote:
>> I pushed a small cleanup to versioning in the poms.
>>
>> Patrick
>>
>> On Tue, Oct 23, 2012 at 2:38 PM, Patrick Hunt <[email protected]> wrote:
>>> I'll work on fixing the tmp issue, that's something I can handle. ;-)
>>> Everything should be in target.
>>>
>>> Patrick
>>>
>>> On Tue, Oct 23, 2012 at 2:37 PM, Aaron McCurry <[email protected]> wrote:
>>>> Hmm, I will take a look at that one next.
>>>>
>>>> Aaron
>>>>
>>>> On Tue, Oct 23, 2012 at 5:20 PM, Patrick Hunt <[email protected]> wrote:
>>>>> Thanks Aaron. The other failing test, "BlurClusterTest", is somehow
>>>>> due to the directory used, "./tmp/cluster". If I change to
>>>>> "file://tmp/cluster" the test passes. Any ideas? Seems somehow related
>>>>> to using relative paths?
>>>>>
>>>>> Patrick
>>>>>
>>>>> On Tue, Oct 23, 2012 at 2:13 PM, Aaron McCurry <[email protected]> wrote:
>>>>>> Found it, the test did not set up the indexing options correctly. I
>>>>>> have committed a fix for the test.
>>>>>>
>>>>>> Aaron
>>>>>>
>>>>>> On Tue, Oct 23, 2012 at 5:08 PM, Aaron McCurry <[email protected]> wrote:
>>>>>>> After cleaning up the test, I have gotten the same NPE. Strange
>>>>>>> behavior, still working on why.
>>>>>>>
>>>>>>> Aaron
>>>>>>>
>>>>>>> On Tue, Oct 23, 2012 at 3:06 PM, Patrick Hunt <[email protected]> wrote:
>>>>>>>> NP. Here's the output. I'm on Ubuntu 12.04, Java 1.6.0_26.
>>>>>>>>
>>>>>>>> "mvn clean test" results in: (I also removed the tmp directories
>>>>>>>> manually, btw, we should move this to mvn target dir)
>>>>>>>>
>>>>>>>> -------------------------------------------------------------------------------
>>>>>>>> Test set: org.apache.blur.utils.TermDocIterableTest
>>>>>>>> -------------------------------------------------------------------------------
>>>>>>>> Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0.005 sec <<< FAILURE!
>>>>>>>> testTermDocIterable(org.apache.blur.utils.TermDocIterableTest) Time elapsed: 0.005 sec <<< ERROR!
>>>>>>>> java.lang.NullPointerException
>>>>>>>>     at org.apache.blur.utils.TermDocIterable.getNext(TermDocIterable.java:82)
>>>>>>>>     at org.apache.blur.utils.TermDocIterable.access$000(TermDocIterable.java:29)
>>>>>>>>     at org.apache.blur.utils.TermDocIterable$1.<init>(TermDocIterable.java:48)
>>>>>>>>     at org.apache.blur.utils.TermDocIterable.iterator(TermDocIterable.java:47)
>>>>>>>>     at org.apache.blur.utils.TermDocIterableTest.testTermDocIterable(TermDocIterableTest.java:65)
>>>>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>>>>>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>>>>>>     at java.lang.reflect.Method.invoke(Method.java:597)
>>>>>>>>     at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
>>>>>>>>     at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
>>>>>>>>     at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
>>>>>>>>     at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
>>>>>>>>     at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
>>>>>>>>     at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:76)
>>>>>>>>     at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
>>>>>>>>     at org.junit.runners.ParentRunner$3.run(ParentRunner.java:193)
>>>>>>>>     at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:52)
>>>>>>>>     at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:191)
>>>>>>>>     at org.junit.runners.ParentRunner.access$000(ParentRunner.java:42)
>>>>>>>>     at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:184)
>>>>>>>>     at org.junit.runners.ParentRunner.run(ParentRunner.java:236)
>>>>>>>>     at org.apache.maven.surefire.junit4.JUnit4TestSet.execute(JUnit4TestSet.java:53)
>>>>>>>>     at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:123)
>>>>>>>>     at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:104)
>>>>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>>>>>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>>>>>>     at java.lang.reflect.Method.invoke(Method.java:597)
>>>>>>>>     at org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(ReflectionUtils.java:164)
>>>>>>>>     at org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:110)
>>>>>>>>     at org.apache.maven.surefire.booter.SurefireStarter.invokeProvider(SurefireStarter.java:175)
>>>>>>>>     at org.apache.maven.surefire.booter.SurefireStarter.runSuitesInProcessWhenForked(SurefireStarter.java:107)
>>>>>>>>     at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:68)
>>>>>>>>
>>>>>>>> On Tue, Oct 23, 2012 at 12:02 PM, Aaron McCurry <[email protected]> wrote:
>>>>>>>>> Sorry, just missed that message. Hmm, I will look around and try to
>>>>>>>>> see if I can find something. Thanks.
>>>>>>>>>
>>>>>>>>> Aaron
>>>>>>>>>
>>>>>>>>> On Tue, Oct 23, 2012 at 2:59 PM, Patrick Hunt <[email protected]> wrote:
>>>>>>>>>> This is null in TermDocIterableTest:
>>>>>>>>>>
>>>>>>>>>> DocsEnum termDocs = atomicReader.termDocsEnum(new Term("id", Integer.toString(id)));
>>>>>>>>>>
>>>>>>>>>> due to fields() being null in the termDocsEnum method.
>>>>>>>>>>
>>>>>>>>>> I don't see why yet though. Given the segment file exists on the
>>>>>>>>>> filesystem, etc...
>>>>>>>>>>
>>>>>>>>>> Patrick
>>>>>>>>>>
>>>>>>>>>> On Tue, Oct 23, 2012 at 11:50 AM, Aaron McCurry <[email protected]> wrote:
>>>>>>>>>>> Trying to reproduce on Ubuntu.
>>>>>>>>>>>
>>>>>>>>>>> On Tue, Oct 23, 2012 at 1:58 PM, Patrick Hunt <[email protected]> wrote:
>>>>>>>>>>>> Hm, I just updated and I'm seeing two errors (which is one fewer
>>>>>>>>>>>> issue than before):
>>>>>>>>>>>>
>>>>>>>>>>>> testTermDocIterable(org.apache.blur.utils.TermDocIterableTest)
>>>>>>>>>>>> org.apache.blur.thrift.BlurClusterTest:
>>>>>>>>>>>> java.lang.NullPointerException
>>>>>>>>>>>>
>>>>>>>>>>>> Let me look and see if I can at least determine what the
>>>>>>>>>>>> underlying problems are.
>>>>>>>>>>>>
>>>>>>>>>>>> Patrick
>>>>>>>>>>>>
>>>>>>>>>>>> On Tue, Oct 23, 2012 at 10:12 AM, Aaron McCurry <[email protected]> wrote:
>>>>>>>>>>>>> I ran into some errors with the ZookeeperClusterStatusTest tests
>>>>>>>>>>>>> and have resolved the issues I found. All unit tests pass on OSX;
>>>>>>>>>>>>> I have not had a chance to run them on Linux yet. I also fixed the
>>>>>>>>>>>>> nasty NPE in BlurClusterTest (it was affecting the functional
>>>>>>>>>>>>> tests as well). I ran a few burn-in tests on a VM running a
>>>>>>>>>>>>> 2 controller + 3 shard server Blur cluster. The tests included
>>>>>>>>>>>>> loading data as fast as possible while running searches against
>>>>>>>>>>>>> that data as fast as possible. The tests ran without issue
>>>>>>>>>>>>> (basically like they did before the upgrade to Lucene 4). I feel
>>>>>>>>>>>>> like the code is in a good state at this point. I'm going to merge
>>>>>>>>>>>>> this code to master and create another branch to begin modifying
>>>>>>>>>>>>> the RPC API.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Anyone have any objections?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Aaron
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, Oct 22, 2012 at 8:29 PM, Patrick Hunt <[email protected]> wrote:
>>>>>>>>>>>>>> On Mon, Oct 22, 2012 at 5:23 PM, Aaron McCurry <[email protected]> wrote:
>>>>>>>>>>>>>>> Hmm.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Mon, Oct 22, 2012 at 8:17 PM, Patrick Hunt <[email protected]> wrote:
>>>>>>>>>>>>>>>> Sounds good to me.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Not sure if anyone else is seeing this but the unit tests are
>>>>>>>>>>>>>>>> not passing for me on Ubuntu. I see one failure and two errors.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Failed tests:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> testSafeModeSetInFuture(org.apache.blur.manager.clusterstatus.ZookeeperClusterStatusTest)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Haven't seen this.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Tests in error:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> testTermDocIterable(org.apache.blur.utils.TermDocIterableTest)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> This either.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> org.apache.blur.thrift.BlurClusterTest:
>>>>>>>>>>>>>>>> java.lang.NullPointerException
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I think I have been seeing this one during some functional
>>>>>>>>>>>>>>> tests. Haven't figured out the cause yet, but it seems like
>>>>>>>>>>>>>>> it's a nasty threading problem, because when I drop the mutate
>>>>>>>>>>>>>>> threads back to 1 everything works fine. However, the test was
>>>>>>>>>>>>>>> passing on OSX.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Just me or is this expected?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Not expected. I'm going to set up a VM on my computer to run
>>>>>>>>>>>>>>> the tests in Linux as well.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Ok. Let me know how it goes and I can try and debug it a bit,
>>>>>>>>>>>>>> although you're running much faster than I can at this point. ;-)
>>>>>>>>>>>>>> Definitely let me know if you can't reproduce it and I'll dig
>>>>>>>>>>>>>> into it for sure.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Patrick
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Patrick
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Sun, Oct 21, 2012 at 10:38 AM, Aaron McCurry <[email protected]> wrote:
>>>>>>>>>>>>>>>>> We can fix the jira issues.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Sun, Oct 21, 2012 at 1:36 PM, Garrett Barton <[email protected]> wrote:
>>>>>>>>>>>>>>>>>> Sounds good to me Aaron, call it 0.2. Does that mess up Jira
>>>>>>>>>>>>>>>>>> if you have things scheduled against releases?
>>>>>>>>>>>>>>>>>> On Oct 21, 2012 9:44 AM, "Aaron McCurry" <[email protected]> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Ok, I think it will be some time before all the changes for
>>>>>>>>>>>>>>>>>>> the new api are in place and fully functional. So perhaps we
>>>>>>>>>>>>>>>>>>> should merge the lucene-4.0.0 branch into master and fix
>>>>>>>>>>>>>>>>>>> whatever bugs are found. I did some system testing yesterday
>>>>>>>>>>>>>>>>>>> and only found one big issue. There seems to be a threading
>>>>>>>>>>>>>>>>>>> problem with the BlurAnalyzer. If a single instance is in use
>>>>>>>>>>>>>>>>>>> across multiple threads some weird behaviors happen.
>>>>>>>>>>>>>>>>>>> Otherwise everything else seems to work normally (I will
>>>>>>>>>>>>>>>>>>> create a jira issue).
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> If we do merge the lucene-4.0.0 branch, I feel like we
>>>>>>>>>>>>>>>>>>> should change the version to 0.2. The reason is, the indexes
>>>>>>>>>>>>>>>>>>> in 0.1.x are not going to be backwards compatible (at least
>>>>>>>>>>>>>>>>>>> not without some work). Does anyone have any strong feelings
>>>>>>>>>>>>>>>>>>> on this?
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Aaron
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Sat, Oct 20, 2012 at 10:10 PM, Gagan Juneja <[email protected]> wrote:
>>>>>>>>>>>>>>>>>>> > I agree with Garrett. We can merge this branch to the
>>>>>>>>>>>>>>>>>>> > place from where we cut it. Again, as Garrett said, if we
>>>>>>>>>>>>>>>>>>> > want to keep only the new api thing then we can merge it
>>>>>>>>>>>>>>>>>>> > to master as well.
>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>> > Regards,
>>>>>>>>>>>>>>>>>>> > Gagan
>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>> > On Sat, Oct 20, 2012 at 9:50 PM, Garrett Barton <[email protected]> wrote:
>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>> >> I guess it depends on if you're planning a 1.4 release
>>>>>>>>>>>>>>>>>>> >> with lucene 4. If yes then merge and work towards making
>>>>>>>>>>>>>>>>>>> >> everything functional. If not then leave the 1.3.x in
>>>>>>>>>>>>>>>>>>> >> master for bug fixing or whatnot and merge this branch
>>>>>>>>>>>>>>>>>>> >> into the new api one.
>>>>>>>>>>>>>>>>>>> >> On Oct 20, 2012 11:03 AM, "Aaron McCurry" <[email protected]> wrote:
>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>> >> > I think that we can merge the lucene-4.0.0 branch back
>>>>>>>>>>>>>>>>>>> >> > into the master, since tests and code are compiling. I
>>>>>>>>>>>>>>>>>>> >> > haven't done any functional testing yet, but if much of
>>>>>>>>>>>>>>>>>>> >> > the RPC and internals are going to change I think that
>>>>>>>>>>>>>>>>>>> >> > it may be a waste of time to test and fix everything
>>>>>>>>>>>>>>>>>>> >> > that we are about to change. What do others think?
>>>>>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>>>>>> >> > Aaron
>>>>>>>>>>>>>>>>>>> >> >
>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>
