If there are no objections, I'll merge in my Blur client shell submission. Sound good?
Patrick

On Wed, Oct 24, 2012 at 11:27 AM, Aaron McCurry <[email protected]> wrote:
> +1 on this solution.
>
> On Wed, Oct 24, 2012 at 2:24 PM, Patrick Hunt <[email protected]> wrote:
>> Hi Gagan, I did find the cause, but not a good solution. Relying on
>> everyone to set their umask is going to be onerous. It would be great
>> if you could provide a proper solution - the one you suggested sounds
>> good.
>>
>> Regards,
>>
>> Patrick
>>
>> On Tue, Oct 23, 2012 at 11:53 PM, Gagan Juneja
>> <[email protected]> wrote:
>>> Oops! I missed Patrick's last post.
>>>
>>> On Wed, Oct 24, 2012 at 12:07 PM, Gagan Juneja
>>> <[email protected]> wrote:
>>>
>>>> I have simulated this issue on ubuntu box. I found that by default ubuntu
>>>> creates directory with *775* permissions. And there is one property in
>>>> Hadoop Configuration named "dfs.datanode.data.dir.perm" and default value
>>>> for this is *755*. Somewhere in code permissions for data directories are
>>>> verified and it fails there and then.
>>>>
>>>> If we set this property in Configuration object with value *775*, all the
>>>> test cases are passing and build is Successful.
>>>>
>>>> We can set this in *startDfs* method of
>>>> *org.apache.blur.MiniCluster* class. Please verify this, if problem got
>>>> resolved at your end then I can
>>>> provide patch for this.
>>>>
>>>> Regards,
>>>> Gagan
>>>>
>>>>
>>>>
>>>> On Wed, Oct 24, 2012 at 4:32 AM, Patrick Hunt <[email protected]> wrote:
>>>>
>>>>> Pushed a small cleanup to move all test file output into respective
>>>>> target directories and use absolute paths for test file locations.
>>>>>
>>>>> I thought this might fix the BlurClusterTest however that's not the case:
>>>>>
>>>>> Starting DataNode 0 with dfs.data.dir:
>>>>> /home/phunt/dev/blur/src/blur-core/target/tmp/cluster/dfs/data/data1,/home/phunt/dev/blur/src/blur-core/target/tmp/cluster/dfs/data/data2
>>>>> ERROR 20121023_15:58:10:010_PDT [main] datanode.DataNode: All
>>>>> directories in dfs.data.dir are invalid.
>>>>> ERROR 20121023_15:58:10:010_PDT [main] datanode.DataNode: All
>>>>> directories in dfs.data.dir are invalid.
>>>>> ERROR 20121023_15:58:10:010_PDT [main] blur.MiniCluster: error opening
>>>>> file system
>>>>> java.lang.NullPointerException
>>>>>     at org.apache.hadoop.hdfs.MiniDFSCluster.startDataNodes(MiniDFSCluster.java:422)
>>>>>     at org.apache.hadoop.hdfs.MiniDFSCluster.<init>(MiniDFSCluster.java:280)
>>>>>     at org.apache.hadoop.hdfs.MiniDFSCluster.<init>(MiniDFSCluster.java:124)
>>>>>
>>>>> Patrick
>>>>>
>>>>> On Tue, Oct 23, 2012 at 2:43 PM, Patrick Hunt <[email protected]> wrote:
>>>>> > I pushed a small cleanup to versioning in the poms.
>>>>> >
>>>>> > Patrick
>>>>> >
>>>>> > On Tue, Oct 23, 2012 at 2:38 PM, Patrick Hunt <[email protected]> wrote:
>>>>> >> I'll work on fixing the tmp issue, that's something I can handle. ;-)
>>>>> >> Everything should be in target.
>>>>> >>
>>>>> >> Patrick
>>>>> >>
>>>>> >> On Tue, Oct 23, 2012 at 2:37 PM, Aaron McCurry <[email protected]> wrote:
>>>>> >>> Hmm, I will take a look at that one next.
>>>>> >>>
>>>>> >>> Aaron
>>>>> >>>
>>>>> >>> On Tue, Oct 23, 2012 at 5:20 PM, Patrick Hunt <[email protected]> wrote:
>>>>> >>>> Thanks Aaron. The other failing test "BlurClusterTest" is somehow due
>>>>> >>>> to the directory used. "./tmp/cluster". If I change to
>>>>> >>>> "file://tmp/cluster" the test passes. Any ideas? Seems somehow related
>>>>> >>>> to using relative paths?
>>>>> >>>>
>>>>> >>>> Patrick
>>>>> >>>>
>>>>> >>>> On Tue, Oct 23, 2012 at 2:13 PM, Aaron McCurry <[email protected]> wrote:
>>>>> >>>>> Found it, the test did not setup the indexing options correctly. I
>>>>> >>>>> have committed a fix for the test.
>>>>> >>>>>
>>>>> >>>>> Aaron
>>>>> >>>>>
>>>>> >>>>> On Tue, Oct 23, 2012 at 5:08 PM, Aaron McCurry <[email protected]> wrote:
>>>>> >>>>>> After cleaning up the test, I have gotten the same NPE. Strange
>>>>> >>>>>> behavior, still working on why.
>>>>> >>>>>>
>>>>> >>>>>> Aaron
>>>>> >>>>>>
>>>>> >>>>>> On Tue, Oct 23, 2012 at 3:06 PM, Patrick Hunt <[email protected]> wrote:
>>>>> >>>>>>> NP. here's the output. I'm on ubuntu 12.04. 1.6.0_26
>>>>> >>>>>>>
>>>>> >>>>>>> "mvn clean test" results in: (I also removed the tmp directories
>>>>> >>>>>>> manually, btw, we should move this to mvn target dir)
>>>>> >>>>>>>
>>>>> >>>>>>> -------------------------------------------------------------------------------
>>>>> >>>>>>> Test set: org.apache.blur.utils.TermDocIterableTest
>>>>> >>>>>>> -------------------------------------------------------------------------------
>>>>> >>>>>>> Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0.005
>>>>> >>>>>>> sec <<< FAILURE!
>>>>> >>>>>>> testTermDocIterable(org.apache.blur.utils.TermDocIterableTest) Time
>>>>> >>>>>>> elapsed: 0.005 sec <<< ERROR!
>>>>> >>>>>>> java.lang.NullPointerException
>>>>> >>>>>>>     at org.apache.blur.utils.TermDocIterable.getNext(TermDocIterable.java:82)
>>>>> >>>>>>>     at org.apache.blur.utils.TermDocIterable.access$000(TermDocIterable.java:29)
>>>>> >>>>>>>     at org.apache.blur.utils.TermDocIterable$1.<init>(TermDocIterable.java:48)
>>>>> >>>>>>>     at org.apache.blur.utils.TermDocIterable.iterator(TermDocIterable.java:47)
>>>>> >>>>>>>     at org.apache.blur.utils.TermDocIterableTest.testTermDocIterable(TermDocIterableTest.java:65)
>>>>> >>>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>> >>>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>>> >>>>>>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>>> >>>>>>>     at java.lang.reflect.Method.invoke(Method.java:597)
>>>>> >>>>>>>     at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
>>>>> >>>>>>>     at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
>>>>> >>>>>>>     at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
>>>>> >>>>>>>     at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
>>>>> >>>>>>>     at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
>>>>> >>>>>>>     at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:76)
>>>>> >>>>>>>     at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
>>>>> >>>>>>>     at org.junit.runners.ParentRunner$3.run(ParentRunner.java:193)
>>>>> >>>>>>>     at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:52)
>>>>> >>>>>>>     at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:191)
>>>>> >>>>>>>     at org.junit.runners.ParentRunner.access$000(ParentRunner.java:42)
>>>>> >>>>>>>     at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:184)
>>>>> >>>>>>>     at org.junit.runners.ParentRunner.run(ParentRunner.java:236)
>>>>> >>>>>>>     at org.apache.maven.surefire.junit4.JUnit4TestSet.execute(JUnit4TestSet.java:53)
>>>>> >>>>>>>     at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:123)
>>>>> >>>>>>>     at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:104)
>>>>> >>>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>> >>>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>>> >>>>>>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>>> >>>>>>>     at java.lang.reflect.Method.invoke(Method.java:597)
>>>>> >>>>>>>     at org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(ReflectionUtils.java:164)
>>>>> >>>>>>>     at org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:110)
>>>>> >>>>>>>     at org.apache.maven.surefire.booter.SurefireStarter.invokeProvider(SurefireStarter.java:175)
>>>>> >>>>>>>     at org.apache.maven.surefire.booter.SurefireStarter.runSuitesInProcessWhenForked(SurefireStarter.java:107)
>>>>> >>>>>>>     at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:68)
>>>>> >>>>>>>
>>>>> >>>>>>>
>>>>> >>>>>>> On Tue, Oct 23, 2012 at 12:02 PM, Aaron McCurry <[email protected]> wrote:
>>>>> >>>>>>>> Sorry, just missed that message. Hmm, I will look around and try to
>>>>> >>>>>>>> see if I can find something. Thanks.
>>>>> >>>>>>>>
>>>>> >>>>>>>> Aaron
>>>>> >>>>>>>>
>>>>> >>>>>>>> On Tue, Oct 23, 2012 at 2:59 PM, Patrick Hunt <[email protected]> wrote:
>>>>> >>>>>>>>> this is null in termdocsitertest
>>>>> >>>>>>>>>
>>>>> >>>>>>>>> DocsEnum termDocs = atomicReader.termDocsEnum(new Term("id",
>>>>> >>>>>>>>> Integer.toString(id)));
>>>>> >>>>>>>>>
>>>>> >>>>>>>>> due to fields() being null in termDocsEnum method
>>>>> >>>>>>>>>
>>>>> >>>>>>>>> I don't see why yet though. Given the segment file exists on the
>>>>> >>>>>>>>> filesystem, etc...
>>>>> >>>>>>>>>
>>>>> >>>>>>>>> Patrick
>>>>> >>>>>>>>>
>>>>> >>>>>>>>> On Tue, Oct 23, 2012 at 11:50 AM, Aaron McCurry <[email protected]> wrote:
>>>>> >>>>>>>>>> Trying to reproduce on Ubuntu.
>>>>> >>>>>>>>>>
>>>>> >>>>>>>>>> On Tue, Oct 23, 2012 at 1:58 PM, Patrick Hunt <[email protected]> wrote:
>>>>> >>>>>>>>>>> Hm, I just updated and I'm seeing two errors (which is 1 less issue
>>>>> >>>>>>>>>>> than before):
>>>>> >>>>>>>>>>>
>>>>> >>>>>>>>>>> testTermDocIterable(org.apache.blur.utils.TermDocIterableTest)
>>>>> >>>>>>>>>>> org.apache.blur.thrift.BlurClusterTest: java.lang.NullPointerException
>>>>> >>>>>>>>>>>
>>>>> >>>>>>>>>>> Let me look and see if I can at least determine what the underlying
>>>>> >>>>>>>>>>> problems are.
>>>>> >>>>>>>>>>>
>>>>> >>>>>>>>>>> Patrick
>>>>> >>>>>>>>>>>
>>>>> >>>>>>>>>>> On Tue, Oct 23, 2012 at 10:12 AM, Aaron McCurry <[email protected]> wrote:
>>>>> >>>>>>>>>>>> I ran into some errors with ZookeeperClusterStatusTest tests and have
>>>>> >>>>>>>>>>>> resolved the issues I found. All units tests pass on OSX, I have not
>>>>> >>>>>>>>>>>> had a chance to run them on Linux yet. I also fixed the nasty NPE
>>>>> >>>>>>>>>>>> exception on the BlurClusterTest (it was affecting the functional
>>>>> >>>>>>>>>>>> tests as well). I ran a few burn-in tests on a VM running a 2
>>>>> >>>>>>>>>>>> controller + 3 shard server Blur cluster. The tests included loaded
>>>>> >>>>>>>>>>>> data as fast as possibly while running searches against that data as
>>>>> >>>>>>>>>>>> fast as possible. The tests ran without issue (basically like they
>>>>> >>>>>>>>>>>> did before the upgrade to Lucene 4). I feel like the code is in a
>>>>> >>>>>>>>>>>> good state at this point. I'm going to merge this code to master and
>>>>> >>>>>>>>>>>> create another branch to begin modifying the RPC API.
>>>>> >>>>>>>>>>>>
>>>>> >>>>>>>>>>>> Anyone have any objections?
>>>>> >>>>>>>>>>>>
>>>>> >>>>>>>>>>>> Aaron
>>>>> >>>>>>>>>>>>
>>>>> >>>>>>>>>>>> On Mon, Oct 22, 2012 at 8:29 PM, Patrick Hunt <[email protected]> wrote:
>>>>> >>>>>>>>>>>>> On Mon, Oct 22, 2012 at 5:23 PM, Aaron McCurry <[email protected]> wrote:
>>>>> >>>>>>>>>>>>>> Hmm.
>>>>> >>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>> On Mon, Oct 22, 2012 at 8:17 PM, Patrick Hunt <[email protected]> wrote:
>>>>> >>>>>>>>>>>>>>> Sounds good to me.
>>>>> >>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>> Not sure if anyone else is seeing this but the unit tests are not
>>>>> >>>>>>>>>>>>>>> passing for me on ubuntu. I see one failure and two errors.
>>>>> >>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>> Failed tests:
>>>>> >>>>>>>>>>>>>>> testSafeModeSetInFuture(org.apache.blur.manager.clusterstatus.ZookeeperClusterStatusTest)
>>>>> >>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>> Haven't seen this.
>>>>> >>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>> Tests in error:
>>>>> >>>>>>>>>>>>>>> testTermDocIterable(org.apache.blur.utils.TermDocIterableTest)
>>>>> >>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>> This either.
>>>>> >>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>> org.apache.blur.thrift.BlurClusterTest: java.lang.NullPointerException
>>>>> >>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>> I think I have been seeing this one during some functional tests.
>>>>> >>>>>>>>>>>>>> Haven't figured out the cause yet, but it seems like it's a nasty
>>>>> >>>>>>>>>>>>>> threading problem. Because when I drop the mutate threads back 1
>>>>> >>>>>>>>>>>>>> everything works fine. However the test was passing on OSX.
>>>>> >>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>> Just me or is this expected?
>>>>> >>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>> Not expected. I'm going to setup a VM on computer to run tests in
>>>>> >>>>>>>>>>>>>> Linux as well.
>>>>> >>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>> Ok. Let me know how it goes and I can try and debug it a bit, although
>>>>> >>>>>>>>>>>>> you're running much faster than I can at this point. ;-) Definitely
>>>>> >>>>>>>>>>>>> let me know if you can't reproduce it and I'll dig into it for sure.
>>>>> >>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>> Patrick
>>>>> >>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>> Patrick
>>>>> >>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>> On Sun, Oct 21, 2012 at 10:38 AM, Aaron McCurry <[email protected]> wrote:
>>>>> >>>>>>>>>>>>>>>> We can fix the jira issues.
>>>>> >>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>> On Sun, Oct 21, 2012 at 1:36 PM, Garrett Barton
>>>>> >>>>>>>>>>>>>>>> <[email protected]> wrote:
>>>>> >>>>>>>>>>>>>>>>> Sounds good to me Aaron, call it 0.2. Does that mess up Jira if you have
>>>>> >>>>>>>>>>>>>>>>> things scheduled against releases?
>>>>> >>>>>>>>>>>>>>>>> On Oct 21, 2012 9:44 AM, "Aaron McCurry" <[email protected]> wrote:
>>>>> >>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>> Ok, I think it will be some time before all the changes for the new
>>>>> >>>>>>>>>>>>>>>>>> api are in place and fully functional. So perhaps we should merge the
>>>>> >>>>>>>>>>>>>>>>>> lucene-4.0.0 branch into master and fix whatever bugs are found. I
>>>>> >>>>>>>>>>>>>>>>>> did some system testing yesterday and only found one big issue. There
>>>>> >>>>>>>>>>>>>>>>>> seems to be a threading problem with the BlurAnalyzer. If a single
>>>>> >>>>>>>>>>>>>>>>>> instance is in use across multiple threads some weird behaviors
>>>>> >>>>>>>>>>>>>>>>>> happen. Otherwise everything else seems to work, normally (I will
>>>>> >>>>>>>>>>>>>>>>>> create a jira issue).
>>>>> >>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>> If we do merge the lucene-4.0.0 branch, I feel like we should change
>>>>> >>>>>>>>>>>>>>>>>> the version to 0.2. The reason is, the indexes in 0.1.x are not going
>>>>> >>>>>>>>>>>>>>>>>> to be backwards compatible (at least not with out some work). Does
>>>>> >>>>>>>>>>>>>>>>>> anyone have any strong feelings on this?
>>>>> >>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>> Aaron
>>>>> >>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>> On Sat, Oct 20, 2012 at 10:10 PM, Gagan Juneja
>>>>> >>>>>>>>>>>>>>>>>> <[email protected]> wrote:
>>>>> >>>>>>>>>>>>>>>>>> > I agree with Garrett. We can merge this branch to the place from where we
>>>>> >>>>>>>>>>>>>>>>>> > cut it. Again as Garrett said If we want to keep only new api thing then we
>>>>> >>>>>>>>>>>>>>>>>> > can merge it to master as well.
>>>>> >>>>>>>>>>>>>>>>>> >
>>>>> >>>>>>>>>>>>>>>>>> > Regards,
>>>>> >>>>>>>>>>>>>>>>>> > Gagan
>>>>> >>>>>>>>>>>>>>>>>> >
>>>>> >>>>>>>>>>>>>>>>>> > On Sat, Oct 20, 2012 at 9:50 PM, Garrett Barton <[email protected]> wrote:
>>>>> >>>>>>>>>>>>>>>>>> >
>>>>> >>>>>>>>>>>>>>>>>> >> I guess it depends on if your planning a 1.4 release with lucene 4. If yes
>>>>> >>>>>>>>>>>>>>>>>> >> then merge and work towards making everything functional. If not then leave
>>>>> >>>>>>>>>>>>>>>>>> >> the 1.3.x in master for bug fixing or whatnot and merge this branch into
>>>>> >>>>>>>>>>>>>>>>>> >> the new api one.
>>>>> >>>>>>>>>>>>>>>>>> >> On Oct 20, 2012 11:03 AM, "Aaron McCurry" <[email protected]> wrote:
>>>>> >>>>>>>>>>>>>>>>>> >>
>>>>> >>>>>>>>>>>>>>>>>> >> > I think that we can merge the lucene-4.0.0 branch back into the
>>>>> >>>>>>>>>>>>>>>>>> >> > master, since tests and code are compiling. I haven't done any
>>>>> >>>>>>>>>>>>>>>>>> >> > functional testing yet, but if much of the RPC and internals are going
>>>>> >>>>>>>>>>>>>>>>>> >> > to change I think that it may be a waste of time to test and fix
>>>>> >>>>>>>>>>>>>>>>>> >> > everything that we are about to change. What do others think?
>>>>> >>>>>>>>>>>>>>>>>> >> >
>>>>> >>>>>>>>>>>>>>>>>> >> > Aaron
>>>>> >>>>>>>>>>>>>>>>>> >> >
>>>>> >>>>>>>>>>>>>>>>>> >>
>>>>> >>>>>>>>>>>>>>>>>>
>>>>>
>>>>
>>>>
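
For reference, the permission mismatch discussed in this thread (test directories created as 775 while dfs.datanode.data.dir.perm defaults to 755) comes down to umask arithmetic, which can be sketched in plain Java. This is only an illustrative sketch: the class and method names (DirPermCheck, dirModeForUmask, matchesExpected) are hypothetical and not part of Blur or Hadoop, the umask values 022 and 002 are assumed examples, and the exact-match comparison is an assumption based on the symptoms reported above rather than on the Hadoop source.

```java
// Hypothetical helper, not part of Blur or Hadoop: shows how the process
// umask determines the mode of a newly created directory.
class DirPermCheck {

    // A directory requested with mode 0777 ends up with 0777 & ~umask.
    static String dirModeForUmask(int umask) {
        return Integer.toOctalString(0777 & ~umask);
    }

    // Assumption: the data directory check is an exact match on the octal mode.
    static boolean matchesExpected(String actualMode, String expectedMode) {
        return actualMode.equals(expectedMode);
    }

    public static void main(String[] args) {
        // Default of dfs.datanode.data.dir.perm as reported in the thread.
        String expected = "755";
        // Assumed example umask values (octal literals).
        int[] umasks = {0022, 0002};
        for (int umask : umasks) {
            String actual = dirModeForUmask(umask);
            String verdict = matchesExpected(actual, expected) ? "matches" : "differs from";
            System.out.println("umask " + String.format("%03o", umask)
                    + " -> directory mode " + actual
                    + " (" + verdict + " " + expected + ")");
        }
    }
}
```

With a umask of 002 the resulting mode is 775, which is the case reported on the Ubuntu boxes above. The fix proposed in the thread would amount to setting dfs.datanode.data.dir.perm to 775 on the Configuration used by the mini cluster; that call is not shown here because the body of startDfs does not appear in this thread.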
