+1 on this solution.
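
For anyone following along, here is a minimal sketch of why the check discussed in the quoted thread below trips. It is not Hadoop's code. It only assumes that the DataNode compares each data directory's mode against the configured "dfs.datanode.data.dir.perm" value, as Gagan describes. The class and method names (DataDirPermCheck, toOctal, matches) are made up for illustration, and it uses java.nio.file, so it needs Java 7+ and a POSIX file system.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Set;

// Hypothetical helper, NOT part of Hadoop or Blur: mimics a
// "directory mode must equal the configured mode" check.
public class DataDirPermCheck {

    // Convert a POSIX permission set to a three-digit octal string, e.g. "775".
    public static String toOctal(Set<PosixFilePermission> perms) {
        int mode = 0;
        for (PosixFilePermission p : perms) {
            switch (p) {
                case OWNER_READ:     mode |= 0400; break;
                case OWNER_WRITE:    mode |= 0200; break;
                case OWNER_EXECUTE:  mode |= 0100; break;
                case GROUP_READ:     mode |= 0040; break;
                case GROUP_WRITE:    mode |= 0020; break;
                case GROUP_EXECUTE:  mode |= 0010; break;
                case OTHERS_READ:    mode |= 0004; break;
                case OTHERS_WRITE:   mode |= 0002; break;
                case OTHERS_EXECUTE: mode |= 0001; break;
            }
        }
        return String.format("%03o", mode);
    }

    // True if the directory's actual mode equals the expected octal string.
    public static boolean matches(Path dir, String expectedOctal) throws IOException {
        return toOctal(Files.getPosixFilePermissions(dir)).equals(expectedOctal);
    }

    public static void main(String[] args) throws IOException {
        // Simulate a data directory created under a 002 umask (mode 775).
        Path dir = Files.createTempDirectory("data1");
        Files.setPosixFilePermissions(dir, PosixFilePermissions.fromString("rwxrwxr-x"));

        // Against the default expectation (755) the check fails;
        // against an expectation of 775 it passes.
        System.out.println("matches 755: " + matches(dir, "755"));
        System.out.println("matches 775: " + matches(dir, "775"));

        Files.delete(dir);
    }
}
```

If that assumption holds, the proposed fix would presumably be a single conf.set("dfs.datanode.data.dir.perm", "775") on the Hadoop Configuration in MiniCluster.startDfs, so the tests no longer depend on each developer's umask.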
On Wed, Oct 24, 2012 at 2:24 PM, Patrick Hunt <[email protected]> wrote:
> Hi Gagan, I did find the cause, but not a good solution. Relying on
> everyone to set their umask is going to be onerous. It would be great
> if you could provide a proper solution - the one you suggested sounds
> good.
>
> Regards,
>
> Patrick
>
> On Tue, Oct 23, 2012 at 11:53 PM, Gagan Juneja
> <[email protected]> wrote:
>> Oops! I missed Patrick's last post.
>>
>> On Wed, Oct 24, 2012 at 12:07 PM, Gagan Juneja
>> <[email protected]>wrote:
>>
>>> I have simulated this issue on ubuntu box. I found that by default ubuntu
>>> creates directory with *775 *permissions. And there is one property in
>>> Hadoop Configuration named "dfs.datanode.data.dir.perm" and default value
>>> for this is *755*. Somewhere in code permissions for data directories are
>>> verified and it fails there and then.
>>>
>>> If we set this property in Configuration object with value *775,* all the
>>> test cases are passing and build is Successful.
>>>
>>> We can set this in *startDfs* method of
>>> *org.apache.blur.MiniCluster*class. Please verify this, if problem got
>>> resolved at your end then I can
>>> provide patch for this.
>>>
>>> Regards,
>>> Gagan
>>>
>>>
>>>
>>> On Wed, Oct 24, 2012 at 4:32 AM, Patrick Hunt <[email protected]> wrote:
>>>
>>>> Pushed a small cleanup to move all test file output into respective
>>>> target directories and use absolute paths for test file locations.
>>>>
>>>> I thought this might fix the BlurClusterTest however that's not the case:
>>>>
>>>> Starting DataNode 0 with dfs.data.dir:
>>>>
>>>> /home/phunt/dev/blur/src/blur-core/target/tmp/cluster/dfs/data/data1,/home/phunt/dev/blur/src/blur-core/target/tmp/cluster/dfs/data/data2
>>>> ERROR 20121023_15:58:10:010_PDT [main] datanode.DataNode: All
>>>> directories in dfs.data.dir are invalid.
>>>> ERROR 20121023_15:58:10:010_PDT [main] datanode.DataNode: All
>>>> directories in dfs.data.dir are invalid.
>>>> ERROR 20121023_15:58:10:010_PDT [main] blur.MiniCluster: error opening
>>>> file system
>>>> java.lang.NullPointerException
>>>>     at org.apache.hadoop.hdfs.MiniDFSCluster.startDataNodes(MiniDFSCluster.java:422)
>>>>     at org.apache.hadoop.hdfs.MiniDFSCluster.<init>(MiniDFSCluster.java:280)
>>>>     at org.apache.hadoop.hdfs.MiniDFSCluster.<init>(MiniDFSCluster.java:124)
>>>>
>>>> Patrick
>>>>
>>>> On Tue, Oct 23, 2012 at 2:43 PM, Patrick Hunt <[email protected]> wrote:
>>>> > I pushed a small cleanup to versioning in the poms.
>>>> >
>>>> > Patrick
>>>> >
>>>> > On Tue, Oct 23, 2012 at 2:38 PM, Patrick Hunt <[email protected]> wrote:
>>>> >> I'll work on fixing the tmp issue, that's something I can handle. ;-)
>>>> >> Everything should be in target.
>>>> >>
>>>> >> Patrick
>>>> >>
>>>> >> On Tue, Oct 23, 2012 at 2:37 PM, Aaron McCurry <[email protected]> wrote:
>>>> >>> Hmm, I will take a look at that one next.
>>>> >>>
>>>> >>> Aaron
>>>> >>>
>>>> >>> On Tue, Oct 23, 2012 at 5:20 PM, Patrick Hunt <[email protected]> wrote:
>>>> >>>> Thanks Aaron. The other failing test "BlurClusterTest" is somehow due
>>>> >>>> to the directory used. "./tmp/cluster". If I change to
>>>> >>>> "file://tmp/cluster" the test passes. Any ideas? Seems somehow related
>>>> >>>> to using relative paths?
>>>> >>>>
>>>> >>>> Patrick
>>>> >>>>
>>>> >>>> On Tue, Oct 23, 2012 at 2:13 PM, Aaron McCurry <[email protected]> wrote:
>>>> >>>>> Found it, the test did not setup the indexing options correctly. I
>>>> >>>>> have committed a fix for the test.
>>>> >>>>>
>>>> >>>>> Aaron
>>>> >>>>>
>>>> >>>>> On Tue, Oct 23, 2012 at 5:08 PM, Aaron McCurry <[email protected]> wrote:
>>>> >>>>>> After cleaning up the test, I have gotten the same NPE. Strange
>>>> >>>>>> behavior, still working on why.
>>>> >>>>>>
>>>> >>>>>> Aaron
>>>> >>>>>>
>>>> >>>>>> On Tue, Oct 23, 2012 at 3:06 PM, Patrick Hunt <[email protected]> wrote:
>>>> >>>>>>> NP. here's the output. I'm on ubuntu 12.04. 1.6.0_26
>>>> >>>>>>>
>>>> >>>>>>> "mvn clean test" results in: (I also removed the tmp directories
>>>> >>>>>>> manually, btw, we should move this to mvn target dir)
>>>> >>>>>>>
>>>> >>>>>>> -------------------------------------------------------------------------------
>>>> >>>>>>> Test set: org.apache.blur.utils.TermDocIterableTest
>>>> >>>>>>> -------------------------------------------------------------------------------
>>>> >>>>>>> Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0.005 sec <<< FAILURE!
>>>> >>>>>>> testTermDocIterable(org.apache.blur.utils.TermDocIterableTest) Time elapsed: 0.005 sec <<< ERROR!
>>>> >>>>>>> java.lang.NullPointerException
>>>> >>>>>>>     at org.apache.blur.utils.TermDocIterable.getNext(TermDocIterable.java:82)
>>>> >>>>>>>     at org.apache.blur.utils.TermDocIterable.access$000(TermDocIterable.java:29)
>>>> >>>>>>>     at org.apache.blur.utils.TermDocIterable$1.<init>(TermDocIterable.java:48)
>>>> >>>>>>>     at org.apache.blur.utils.TermDocIterable.iterator(TermDocIterable.java:47)
>>>> >>>>>>>     at org.apache.blur.utils.TermDocIterableTest.testTermDocIterable(TermDocIterableTest.java:65)
>>>> >>>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>> >>>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>> >>>>>>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>> >>>>>>>     at java.lang.reflect.Method.invoke(Method.java:597)
>>>> >>>>>>>     at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
>>>> >>>>>>>     at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
>>>> >>>>>>>     at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
>>>> >>>>>>>     at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
>>>> >>>>>>>     at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
>>>> >>>>>>>     at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:76)
>>>> >>>>>>>     at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
>>>> >>>>>>>     at org.junit.runners.ParentRunner$3.run(ParentRunner.java:193)
>>>> >>>>>>>     at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:52)
>>>> >>>>>>>     at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:191)
>>>> >>>>>>>     at org.junit.runners.ParentRunner.access$000(ParentRunner.java:42)
>>>> >>>>>>>     at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:184)
>>>> >>>>>>>     at org.junit.runners.ParentRunner.run(ParentRunner.java:236)
>>>> >>>>>>>     at org.apache.maven.surefire.junit4.JUnit4TestSet.execute(JUnit4TestSet.java:53)
>>>> >>>>>>>     at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:123)
>>>> >>>>>>>     at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:104)
>>>> >>>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>> >>>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>> >>>>>>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>> >>>>>>>     at java.lang.reflect.Method.invoke(Method.java:597)
>>>> >>>>>>>     at org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(ReflectionUtils.java:164)
>>>> >>>>>>>     at org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:110)
>>>> >>>>>>>     at org.apache.maven.surefire.booter.SurefireStarter.invokeProvider(SurefireStarter.java:175)
>>>> >>>>>>>     at org.apache.maven.surefire.booter.SurefireStarter.runSuitesInProcessWhenForked(SurefireStarter.java:107)
>>>> >>>>>>>     at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:68)
>>>> >>>>>>>
>>>> >>>>>>>
>>>> >>>>>>> On Tue, Oct 23, 2012 at 12:02 PM, Aaron McCurry <[email protected]> wrote:
>>>> >>>>>>>> Sorry, just missed that message. Hmm, I will look around and try to
>>>> >>>>>>>> see if I can find something. Thanks.
>>>> >>>>>>>>
>>>> >>>>>>>> Aaron
>>>> >>>>>>>>
>>>> >>>>>>>> On Tue, Oct 23, 2012 at 2:59 PM, Patrick Hunt <[email protected]> wrote:
>>>> >>>>>>>>> this is null in termdocsitertest
>>>> >>>>>>>>>
>>>> >>>>>>>>> DocsEnum termDocs = atomicReader.termDocsEnum(new Term("id",
>>>> >>>>>>>>> Integer.toString(id)));
>>>> >>>>>>>>>
>>>> >>>>>>>>> due to fields() being null in termDocsEnum method
>>>> >>>>>>>>>
>>>> >>>>>>>>> I don't see why yet though. Given the segment file exists on the
>>>> >>>>>>>>> filesystem, etc...
>>>> >>>>>>>>>
>>>> >>>>>>>>> Patrick
>>>> >>>>>>>>>
>>>> >>>>>>>>> On Tue, Oct 23, 2012 at 11:50 AM, Aaron McCurry <[email protected]> wrote:
>>>> >>>>>>>>>> Trying to reproduce on Ubuntu.
>>>> >>>>>>>>>>
>>>> >>>>>>>>>> On Tue, Oct 23, 2012 at 1:58 PM, Patrick Hunt <[email protected]> wrote:
>>>> >>>>>>>>>>> Hm, I just updated and I'm seeing two errors (which is 1 less issue
>>>> >>>>>>>>>>> than before):
>>>> >>>>>>>>>>>
>>>> >>>>>>>>>>> testTermDocIterable(org.apache.blur.utils.TermDocIterableTest)
>>>> >>>>>>>>>>> org.apache.blur.thrift.BlurClusterTest: java.lang.NullPointerException
>>>> >>>>>>>>>>>
>>>> >>>>>>>>>>> Let me look and see if I can at least determine what the underlying
>>>> >>>>>>>>>>> problems are.
>>>> >>>>>>>>>>>
>>>> >>>>>>>>>>> Patrick
>>>> >>>>>>>>>>>
>>>> >>>>>>>>>>> On Tue, Oct 23, 2012 at 10:12 AM, Aaron McCurry <[email protected]> wrote:
>>>> >>>>>>>>>>>> I ran into some errors with ZookeeperClusterStatusTest tests and have
>>>> >>>>>>>>>>>> resolved the issues I found. All units tests pass on OSX, I have not
>>>> >>>>>>>>>>>> had a chance to run them on Linux yet. I also fixed the nasty NPE
>>>> >>>>>>>>>>>> exception on the BlurClusterTest (it was affecting the functional
>>>> >>>>>>>>>>>> tests as well). I ran a few burn-in tests on a VM running a 2
>>>> >>>>>>>>>>>> controller + 3 shard server Blur cluster. The tests included loaded
>>>> >>>>>>>>>>>> data as fast as possibly while running searches against that data as
>>>> >>>>>>>>>>>> fast as possible. The tests ran without issue (basically like they
>>>> >>>>>>>>>>>> did before the upgrade to Lucene 4). I feel like the code is in a
>>>> >>>>>>>>>>>> good state at this point. I'm going to merge this code to master and
>>>> >>>>>>>>>>>> create another branch to begin modifying the RPC API.
>>>> >>>>>>>>>>>>
>>>> >>>>>>>>>>>> Anyone have any objections?
>>>> >>>>>>>>>>>>
>>>> >>>>>>>>>>>> Aaron
>>>> >>>>>>>>>>>>
>>>> >>>>>>>>>>>> On Mon, Oct 22, 2012 at 8:29 PM, Patrick Hunt <[email protected]> wrote:
>>>> >>>>>>>>>>>>> On Mon, Oct 22, 2012 at 5:23 PM, Aaron McCurry <[email protected]> wrote:
>>>> >>>>>>>>>>>>>> Hmm.
>>>> >>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>> On Mon, Oct 22, 2012 at 8:17 PM, Patrick Hunt <[email protected]> wrote:
>>>> >>>>>>>>>>>>>>> Sounds good to me.
>>>> >>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>> Not sure if anyone else is seeing this but the unit tests are not
>>>> >>>>>>>>>>>>>>> passing for me on ubuntu. I see one failure and two errors.
>>>> >>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>> Failed tests:
>>>> >>>>>>>>>>>>>>> testSafeModeSetInFuture(org.apache.blur.manager.clusterstatus.ZookeeperClusterStatusTest)
>>>> >>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>> Haven't seen this.
>>>> >>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>> Tests in error:
>>>> >>>>>>>>>>>>>>> testTermDocIterable(org.apache.blur.utils.TermDocIterableTest)
>>>> >>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>> This either.
>>>> >>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>> org.apache.blur.thrift.BlurClusterTest: java.lang.NullPointerException
>>>> >>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>> I think I have been seeing this one during some functional tests.
>>>> >>>>>>>>>>>>>> Haven't figured out the cause yet, but it seems like it's a nasty
>>>> >>>>>>>>>>>>>> threading problem. Because when I drop the mutate threads back 1
>>>> >>>>>>>>>>>>>> everything works fine. However the test was passing on OSX.
>>>> >>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>> Just me or is this expected?
>>>> >>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>> Not expected. I'm going to setup a VM on computer to run tests in
>>>> >>>>>>>>>>>>>> Linux as well.
>>>> >>>>>>>>>>>>>
>>>> >>>>>>>>>>>>> Ok. Let me know how it goes and I can try and debug it a bit, although
>>>> >>>>>>>>>>>>> you're running much faster than I can at this point. ;-) Definitely
>>>> >>>>>>>>>>>>> let me know if you can't reproduce it and I'll dig into it for sure.
>>>> >>>>>>>>>>>>>
>>>> >>>>>>>>>>>>> Patrick
>>>> >>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>> Patrick
>>>> >>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>> On Sun, Oct 21, 2012 at 10:38 AM, Aaron McCurry <[email protected]> wrote:
>>>> >>>>>>>>>>>>>>>> We can fix the jira issues.
>>>> >>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>> On Sun, Oct 21, 2012 at 1:36 PM, Garrett Barton
>>>> >>>>>>>>>>>>>>>> <[email protected]> wrote:
>>>> >>>>>>>>>>>>>>>>> Sounds good to me Aaron, call it 0.2. Does that mess up Jira if you have
>>>> >>>>>>>>>>>>>>>>> things scheduled against releases?
>>>> >>>>>>>>>>>>>>>>> On Oct 21, 2012 9:44 AM, "Aaron McCurry" <[email protected]> wrote:
>>>> >>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>> Ok, I think it will be some time before all the changes for the new
>>>> >>>>>>>>>>>>>>>>>> api are in place and fully functional. So perhaps we should merge the
>>>> >>>>>>>>>>>>>>>>>> lucene-4.0.0 branch into master and fix whatever bugs are found. I
>>>> >>>>>>>>>>>>>>>>>> did some system testing yesterday and only found one big issue. There
>>>> >>>>>>>>>>>>>>>>>> seems to be a threading problem with the BlurAnalyzer. If a single
>>>> >>>>>>>>>>>>>>>>>> instance is in use across multiple threads some weird behaviors
>>>> >>>>>>>>>>>>>>>>>> happen. Otherwise everything else seems to work, normally (I will
>>>> >>>>>>>>>>>>>>>>>> create a jira issue).
>>>> >>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>> If we do merge the lucene-4.0.0 branch, I feel like we should change
>>>> >>>>>>>>>>>>>>>>>> the version to 0.2. The reason is, the indexes in 0.1.x are not going
>>>> >>>>>>>>>>>>>>>>>> to be backwards compatible (at least not with out some work). Does
>>>> >>>>>>>>>>>>>>>>>> anyone have any strong feelings on this?
>>>> >>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>> Aaron
>>>> >>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>> On Sat, Oct 20, 2012 at 10:10 PM, Gagan Juneja
>>>> >>>>>>>>>>>>>>>>>> <[email protected]> wrote:
>>>> >>>>>>>>>>>>>>>>>> > I agree with Garrett. We can merge this branch to the place from where we
>>>> >>>>>>>>>>>>>>>>>> > cut it. Again as Garrett said If we want to keep only new api thing then we
>>>> >>>>>>>>>>>>>>>>>> > can merge it to master as well.
>>>> >>>>>>>>>>>>>>>>>> >
>>>> >>>>>>>>>>>>>>>>>> > Regards,
>>>> >>>>>>>>>>>>>>>>>> > Gagan
>>>> >>>>>>>>>>>>>>>>>> >
>>>> >>>>>>>>>>>>>>>>>> > On Sat, Oct 20, 2012 at 9:50 PM, Garrett Barton <[email protected]>wrote:
>>>> >>>>>>>>>>>>>>>>>> >
>>>> >>>>>>>>>>>>>>>>>> >> I guess it depends on if your planning a 1.4 release with lucene 4. If yes
>>>> >>>>>>>>>>>>>>>>>> >> then merge and work towards making everything functional. If not then leave
>>>> >>>>>>>>>>>>>>>>>> >> the 1.3.x in master for bug fixing or whatnot and merge this branch into
>>>> >>>>>>>>>>>>>>>>>> >> the new api one.
>>>> >>>>>>>>>>>>>>>>>> >> On Oct 20, 2012 11:03 AM, "Aaron McCurry" <[email protected]> wrote:
>>>> >>>>>>>>>>>>>>>>>> >>
>>>> >>>>>>>>>>>>>>>>>> >> > I think that we can merge the lucene-4.0.0 branch back into the
>>>> >>>>>>>>>>>>>>>>>> >> > master, since tests and code are compiling. I haven't done any
>>>> >>>>>>>>>>>>>>>>>> >> > functional testing yet, but if much of the RPC and internals are going
>>>> >>>>>>>>>>>>>>>>>> >> > to change I think that it may be a waste of time to test and fix
>>>> >>>>>>>>>>>>>>>>>> >> > everything that we are about to change. What do others think?
>>>> >>>>>>>>>>>>>>>>>> >> >
>>>> >>>>>>>>>>>>>>>>>> >> > Aaron
>>>> >>>>>>>>>>>>>>>>>> >> >
>>>> >>>>>>>>>>>>>>>>>> >>
>>>> >>>>>>>>>>>>>>>>>>
>>>>
>>>
>>>
