+1 on this solution.
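
For anyone who wants to confirm the mismatch on their own box before the
patch lands, here is a minimal sketch. It assumes only the JDK (java.nio)
on a POSIX filesystem and does not reproduce Hadoop's own check. The class
name DataDirPermCheck, the helper toOctal and the directory name "data1"
are made up for illustration. It creates a directory the default way and
prints the mode it ended up with, next to the expected value (755 unless
another value is passed as the first argument).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.util.Set;

// Hypothetical diagnostic, not part of Blur or Hadoop.
public class DataDirPermCheck {

    // Convert a POSIX permission set to a three-digit octal string such as "755".
    static String toOctal(Set<PosixFilePermission> perms) {
        int mode = 0;
        for (PosixFilePermission p : perms) {
            switch (p) {
                case OWNER_READ:     mode |= 0400; break;
                case OWNER_WRITE:    mode |= 0200; break;
                case OWNER_EXECUTE:  mode |= 0100; break;
                case GROUP_READ:     mode |= 0040; break;
                case GROUP_WRITE:    mode |= 0020; break;
                case GROUP_EXECUTE:  mode |= 0010; break;
                case OTHERS_READ:    mode |= 0004; break;
                case OTHERS_WRITE:   mode |= 0002; break;
                case OTHERS_EXECUTE: mode |= 0001; break;
            }
        }
        return String.format("%03o", mode);
    }

    public static void main(String[] args) throws IOException {
        // Expected mode; 755 is the default quoted in this thread.
        String expected = args.length > 0 ? args[0] : "755";

        // createDirectory with no attributes uses the platform default,
        // so the resulting mode reflects the umask of the current user.
        Path parent = Files.createTempDirectory("perm-check");
        Path dataDir = Files.createDirectory(parent.resolve("data1"));

        // Throws UnsupportedOperationException on non-POSIX filesystems.
        String actual = toOctal(Files.getPosixFilePermissions(dataDir));
        System.out.println("created " + dataDir + " with mode " + actual
                + ", expected " + expected);
        if (!actual.equals(expected)) {
            System.out.println("mismatch: default umask does not yield " + expected);
        }

        // Clean up the scratch directories.
        Files.delete(dataDir);
        Files.delete(parent);
    }
}
```

If this reports 775 where 755 is expected, that matches what Gagan
describes, and setting "dfs.datanode.data.dir.perm" on the Configuration
used by MiniCluster should avoid depending on each developer's umask.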

On Wed, Oct 24, 2012 at 2:24 PM, Patrick Hunt <[email protected]> wrote:
> Hi Gagan, I did find the cause, but not a good solution. Relying on
> everyone to set their umask is going to be onerous. It would be great
> if you could provide a proper solution - the one you suggested sounds
> good.
>
> Regards,
>
> Patrick
>
> On Tue, Oct 23, 2012 at 11:53 PM, Gagan Juneja
> <[email protected]> wrote:
>> Oops! I missed Patrick's last post.
>>
>> On Wed, Oct 24, 2012 at 12:07 PM, Gagan Juneja
>> <[email protected]> wrote:
>>
>>> I have reproduced this issue on an Ubuntu box. I found that by default
>>> Ubuntu creates directories with *775* permissions. There is a property in
>>> the Hadoop Configuration named "dfs.datanode.data.dir.perm" whose default
>>> value is *755*. Somewhere in the code the permissions of the data
>>> directories are verified, and the check fails there.
>>>
>>> If we set this property in the Configuration object to *775*, all the
>>> test cases pass and the build is successful.
>>>
>>> We can set this in the *startDfs* method of the
>>> *org.apache.blur.MiniCluster* class. Please verify this; if the problem is
>>> resolved at your end, I can provide a patch.
>>>
>>> Regards,
>>> Gagan
>>>
>>>
>>>
>>> On Wed, Oct 24, 2012 at 4:32 AM, Patrick Hunt <[email protected]> wrote:
>>>
>>>> Pushed a small cleanup to move all test file output into respective
>>>> target directories and use absolute paths for test file locations.
>>>>
>>>> I thought this might fix the BlurClusterTest however that's not the case:
>>>>
>>>> Starting DataNode 0 with dfs.data.dir:
>>>>
>>>> /home/phunt/dev/blur/src/blur-core/target/tmp/cluster/dfs/data/data1,/home/phunt/dev/blur/src/blur-core/target/tmp/cluster/dfs/data/data2
>>>> ERROR 20121023_15:58:10:010_PDT [main] datanode.DataNode: All
>>>> directories in dfs.data.dir are invalid.
>>>> ERROR 20121023_15:58:10:010_PDT [main] datanode.DataNode: All
>>>> directories in dfs.data.dir are invalid.
>>>> ERROR 20121023_15:58:10:010_PDT [main] blur.MiniCluster: error opening
>>>> file system
>>>> java.lang.NullPointerException
>>>>         at
>>>> org.apache.hadoop.hdfs.MiniDFSCluster.startDataNodes(MiniDFSCluster.java:422)
>>>>         at
>>>> org.apache.hadoop.hdfs.MiniDFSCluster.<init>(MiniDFSCluster.java:280)
>>>>         at
>>>> org.apache.hadoop.hdfs.MiniDFSCluster.<init>(MiniDFSCluster.java:124)
>>>>
>>>> Patrick
>>>>
>>>> On Tue, Oct 23, 2012 at 2:43 PM, Patrick Hunt <[email protected]> wrote:
>>>> > I pushed a small cleanup to versioning in the poms.
>>>> >
>>>> > Patrick
>>>> >
>>>> > On Tue, Oct 23, 2012 at 2:38 PM, Patrick Hunt <[email protected]> wrote:
>>>> >> I'll work on fixing the tmp issue, that's something I can handle. ;-)
>>>> >> Everything should be in target.
>>>> >>
>>>> >> Patrick
>>>> >>
>>>> >> On Tue, Oct 23, 2012 at 2:37 PM, Aaron McCurry <[email protected]>
>>>> wrote:
>>>> >>> Hmm, I will take a look at that one next.
>>>> >>>
>>>> >>> Aaron
>>>> >>>
>>>> >>> On Tue, Oct 23, 2012 at 5:20 PM, Patrick Hunt <[email protected]>
>>>> wrote:
>>>> >>>> Thanks Aaron. The other failing test "BlurClusterTest" is somehow due
>>>> >>>> to the directory used. "./tmp/cluster". If I change to
>>>> >>>> "file://tmp/cluster" the test passes. Any ideas? Seems somehow
>>>> related
>>>> >>>> to using relative paths?
>>>> >>>>
>>>> >>>> Patrick
>>>> >>>>
>>>> >>>> On Tue, Oct 23, 2012 at 2:13 PM, Aaron McCurry <[email protected]>
>>>> wrote:
>>>> >>>>> Found it, the test did not set up the indexing options correctly.  I
>>>> >>>>> have committed a fix for the test.
>>>> >>>>>
>>>> >>>>> Aaron
>>>> >>>>>
>>>> >>>>> On Tue, Oct 23, 2012 at 5:08 PM, Aaron McCurry <[email protected]>
>>>> wrote:
>>>> >>>>>> After cleaning up the test, I have gotten the same NPE.  Strange
>>>> >>>>>> behavior, still working on why.
>>>> >>>>>>
>>>> >>>>>> Aaron
>>>> >>>>>>
>>>> >>>>>> On Tue, Oct 23, 2012 at 3:06 PM, Patrick Hunt <[email protected]>
>>>> wrote:
>>>> >>>>>>> NP. here's the output. I'm on ubuntu 12.04. 1.6.0_26
>>>> >>>>>>>
>>>> >>>>>>> "mvn clean test" results in: (I also removed the tmp directories
>>>> >>>>>>> manually, btw, we should move this to mvn target  dir)
>>>> >>>>>>>
>>>> >>>>>>>
>>>> -------------------------------------------------------------------------------
>>>> >>>>>>> Test set: org.apache.blur.utils.TermDocIterableTest
>>>> >>>>>>>
>>>> -------------------------------------------------------------------------------
>>>> >>>>>>> Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed:
>>>> 0.005
>>>> >>>>>>> sec <<< FAILURE!
>>>> >>>>>>> testTermDocIterable(org.apache.blur.utils.TermDocIterableTest)
>>>>  Time
>>>> >>>>>>> elapsed: 0.005 sec  <<< ERROR!
>>>> >>>>>>> java.lang.NullPointerException
>>>> >>>>>>>         at
>>>> org.apache.blur.utils.TermDocIterable.getNext(TermDocIterable.java:82)
>>>> >>>>>>>         at
>>>> org.apache.blur.utils.TermDocIterable.access$000(TermDocIterable.java:29)
>>>> >>>>>>>         at
>>>> org.apache.blur.utils.TermDocIterable$1.<init>(TermDocIterable.java:48)
>>>> >>>>>>>         at
>>>> org.apache.blur.utils.TermDocIterable.iterator(TermDocIterable.java:47)
>>>> >>>>>>>         at
>>>> org.apache.blur.utils.TermDocIterableTest.testTermDocIterable(TermDocIterableTest.java:65)
>>>> >>>>>>>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native
>>>> Method)
>>>> >>>>>>>         at
>>>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>> >>>>>>>         at
>>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>> >>>>>>>         at java.lang.reflect.Method.invoke(Method.java:597)
>>>> >>>>>>>         at
>>>> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
>>>> >>>>>>>         at
>>>> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
>>>> >>>>>>>         at
>>>> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
>>>> >>>>>>>         at
>>>> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
>>>> >>>>>>>         at
>>>> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
>>>> >>>>>>>         at
>>>> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:76)
>>>> >>>>>>>         at
>>>> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
>>>> >>>>>>>         at
>>>> org.junit.runners.ParentRunner$3.run(ParentRunner.java:193)
>>>> >>>>>>>         at
>>>> org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:52)
>>>> >>>>>>>         at
>>>> org.junit.runners.ParentRunner.runChildren(ParentRunner.java:191)
>>>> >>>>>>>         at
>>>> org.junit.runners.ParentRunner.access$000(ParentRunner.java:42)
>>>> >>>>>>>         at
>>>> org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:184)
>>>> >>>>>>>         at
>>>> org.junit.runners.ParentRunner.run(ParentRunner.java:236)
>>>> >>>>>>>         at
>>>> org.apache.maven.surefire.junit4.JUnit4TestSet.execute(JUnit4TestSet.java:53)
>>>> >>>>>>>         at
>>>> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:123)
>>>> >>>>>>>         at
>>>> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:104)
>>>> >>>>>>>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native
>>>> Method)
>>>> >>>>>>>         at
>>>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>> >>>>>>>         at
>>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>> >>>>>>>         at java.lang.reflect.Method.invoke(Method.java:597)
>>>> >>>>>>>         at
>>>> org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(ReflectionUtils.java:164)
>>>> >>>>>>>         at
>>>> org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:110)
>>>> >>>>>>>         at
>>>> org.apache.maven.surefire.booter.SurefireStarter.invokeProvider(SurefireStarter.java:175)
>>>> >>>>>>>         at
>>>> org.apache.maven.surefire.booter.SurefireStarter.runSuitesInProcessWhenForked(SurefireStarter.java:107)
>>>> >>>>>>>         at
>>>> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:68)
>>>> >>>>>>>
>>>> >>>>>>>
>>>> >>>>>>> On Tue, Oct 23, 2012 at 12:02 PM, Aaron McCurry <
>>>> [email protected]> wrote:
>>>> >>>>>>>> Sorry, just missed that message.  Hmm, I will look around and
>>>> try to
>>>> >>>>>>>> see if I can find something.  Thanks.
>>>> >>>>>>>>
>>>> >>>>>>>> Aaron
>>>> >>>>>>>>
>>>> >>>>>>>> On Tue, Oct 23, 2012 at 2:59 PM, Patrick Hunt <[email protected]>
>>>> wrote:
>>>> >>>>>>>>> this is null in termdocsitertest
>>>> >>>>>>>>>
>>>> >>>>>>>>>         DocsEnum termDocs = atomicReader.termDocsEnum(new
>>>> Term("id",
>>>> >>>>>>>>> Integer.toString(id)));
>>>> >>>>>>>>>
>>>> >>>>>>>>> due to fields() being null in termDocsEnum method
>>>> >>>>>>>>>
>>>> >>>>>>>>> I don't see why yet though. Given the segment file exists on the
>>>> >>>>>>>>> filesystem, etc...
>>>> >>>>>>>>>
>>>> >>>>>>>>> Patrick
>>>> >>>>>>>>>
>>>> >>>>>>>>> On Tue, Oct 23, 2012 at 11:50 AM, Aaron McCurry <
>>>> [email protected]> wrote:
>>>> >>>>>>>>>> Trying to reproduce on Ubuntu.
>>>> >>>>>>>>>>
>>>> >>>>>>>>>> On Tue, Oct 23, 2012 at 1:58 PM, Patrick Hunt <
>>>> [email protected]> wrote:
>>>> >>>>>>>>>>> Hm, I just updated and I'm seeing two errors (which is 1 less
>>>> issue
>>>> >>>>>>>>>>> than before):
>>>> >>>>>>>>>>>
>>>> >>>>>>>>>>>
>>>> testTermDocIterable(org.apache.blur.utils.TermDocIterableTest)
>>>> >>>>>>>>>>>   org.apache.blur.thrift.BlurClusterTest:
>>>> java.lang.NullPointerException
>>>> >>>>>>>>>>>
>>>> >>>>>>>>>>> Let me look and see if I can at least determine what the
>>>> underlying
>>>> >>>>>>>>>>> problems are.
>>>> >>>>>>>>>>>
>>>> >>>>>>>>>>> Patrick
>>>> >>>>>>>>>>>
>>>> >>>>>>>>>>> On Tue, Oct 23, 2012 at 10:12 AM, Aaron McCurry <
>>>> [email protected]> wrote:
>>>> >>>>>>>>>>>> I ran into some errors with ZookeeperClusterStatusTest tests
>>>> and have
>>>> >>>>>>>>>>>> resolved the issues I found.  All unit tests pass on OSX, I
>>>> have not
>>>> >>>>>>>>>>>> had a chance to run them on Linux yet.  I also fixed the
>>>> nasty NPE
>>>> >>>>>>>>>>>> exception on the BlurClusterTest (it was affecting the
>>>> functional
>>>> >>>>>>>>>>>> tests as well).  I ran a few burn-in tests on a VM running a
>>>> 2
>>>> >>>>>>>>>>>> controller + 3 shard server Blur cluster.  The tests
>>>> included loading
>>>> >>>>>>>>>>>> data as fast as possible while running searches against that
>>>> data as
>>>> >>>>>>>>>>>> fast as possible.  The tests ran without issue (basically
>>>> like they
>>>> >>>>>>>>>>>> did before the upgrade to Lucene 4).  I feel like the code
>>>> is in a
>>>> >>>>>>>>>>>> good state at this point.  I'm going to merge this code to
>>>> master and
>>>> >>>>>>>>>>>> create another branch to begin modifying the RPC API.
>>>> >>>>>>>>>>>>
>>>> >>>>>>>>>>>> Anyone have any objections?
>>>> >>>>>>>>>>>>
>>>> >>>>>>>>>>>> Aaron
>>>> >>>>>>>>>>>>
>>>> >>>>>>>>>>>> On Mon, Oct 22, 2012 at 8:29 PM, Patrick Hunt <
>>>> [email protected]> wrote:
>>>> >>>>>>>>>>>>> On Mon, Oct 22, 2012 at 5:23 PM, Aaron McCurry <
>>>> [email protected]> wrote:
>>>> >>>>>>>>>>>>>> Hmm.
>>>> >>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>> On Mon, Oct 22, 2012 at 8:17 PM, Patrick Hunt <
>>>> [email protected]> wrote:
>>>> >>>>>>>>>>>>>>> Sounds good to me.
>>>> >>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>> Not sure if anyone else is seeing this but the unit tests
>>>> are not
>>>> >>>>>>>>>>>>>>> passing for me on ubuntu. I see one failure and two
>>>> errors.
>>>> >>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>> Failed tests:
>>>> >>>>>>>>>>>>>>>
>>>>  
>>>> testSafeModeSetInFuture(org.apache.blur.manager.clusterstatus.ZookeeperClusterStatusTest)
>>>> >>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>> Haven't seen this.
>>>> >>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>> Tests in error:
>>>> >>>>>>>>>>>>>>>
>>>> testTermDocIterable(org.apache.blur.utils.TermDocIterableTest)
>>>> >>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>> This either.
>>>> >>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>   org.apache.blur.thrift.BlurClusterTest:
>>>> java.lang.NullPointerException
>>>> >>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>> I think I have been seeing this one during some functional
>>>> tests.
>>>> >>>>>>>>>>>>>> Haven't figured out the cause yet, but it seems like it's
>>>> a nasty
>>>> >>>>>>>>>>>>>> threading problem.  Because when I drop the mutate threads
>>>> back to 1
>>>> >>>>>>>>>>>>>> everything works fine.  However the test was passing on
>>>> OSX.
>>>> >>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>> Just me or is this expected?
>>>> >>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>> Not expected.  I'm going to set up a VM on my computer to run
>>>> tests in
>>>> >>>>>>>>>>>>>> Linux as well.
>>>> >>>>>>>>>>>>>
>>>> >>>>>>>>>>>>> Ok. Let me know how it goes and I can try and debug it a
>>>> bit, although
>>>> >>>>>>>>>>>>> you're running much faster than I can at this point. ;-)
>>>> Definitely
>>>> >>>>>>>>>>>>> let me know if you can't reproduce it and I'll dig into it
>>>> for sure.
>>>> >>>>>>>>>>>>>
>>>> >>>>>>>>>>>>> Patrick
>>>> >>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>> Patrick
>>>> >>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>> On Sun, Oct 21, 2012 at 10:38 AM, Aaron McCurry <
>>>> [email protected]> wrote:
>>>> >>>>>>>>>>>>>>>> We can fix the jira issues.
>>>> >>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>> On Sun, Oct 21, 2012 at 1:36 PM, Garrett Barton
>>>> >>>>>>>>>>>>>>>> <[email protected]> wrote:
>>>> >>>>>>>>>>>>>>>>> Sounds good to me Aaron, call it 0.2. Does that mess up
>>>> Jira if you have
>>>> >>>>>>>>>>>>>>>>> things scheduled against releases?
>>>> >>>>>>>>>>>>>>>>> On Oct 21, 2012 9:44 AM, "Aaron McCurry" <
>>>> [email protected]> wrote:
>>>> >>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>> Ok, I think it will be some time before all the
>>>> changes for the new
>>>> >>>>>>>>>>>>>>>>>> api are in place and fully functional.  So perhaps we
>>>> should merge the
>>>> >>>>>>>>>>>>>>>>>> lucene-4.0.0 branch into master and fix whatever bugs
>>>> are found.  I
>>>> >>>>>>>>>>>>>>>>>> did some system testing yesterday and only found one
>>>> big issue.  There
>>>> >>>>>>>>>>>>>>>>>> seems to be a threading problem with the BlurAnalyzer.
>>>>  If a single
>>>> >>>>>>>>>>>>>>>>>> instance is in use across multiple threads some weird
>>>> behaviors
>>>> >>>>>>>>>>>>>>>>>> happen.  Otherwise everything else seems to work,
>>>> normally (I will
>>>> >>>>>>>>>>>>>>>>>> create a jira issue).
>>>> >>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>> If we do merge the lucene-4.0.0 branch, I feel like we
>>>> should change
>>>> >>>>>>>>>>>>>>>>>> the version to 0.2.  The reason is, the indexes in
>>>> 0.1.x are not going
>>>> >>>>>>>>>>>>>>>>>> to be backwards compatible (at least not without some
>>>> work).  Does
>>>> >>>>>>>>>>>>>>>>>> anyone have any strong feelings on this?
>>>> >>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>> Aaron
>>>> >>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>> On Sat, Oct 20, 2012 at 10:10 PM, Gagan Juneja
>>>> >>>>>>>>>>>>>>>>>> <[email protected]> wrote:
>>>> >>>>>>>>>>>>>>>>>> > I agree with Garrett. We can merge this branch to
>>>> the place from where we
>>>> >>>>>>>>>>>>>>>>>> > cut it. Again, as Garrett said, if we want to keep
>>>> only new api thing then
>>>> >>>>>>>>>>>>>>>>>> we
>>>> >>>>>>>>>>>>>>>>>> > can merge it to master as well.
>>>> >>>>>>>>>>>>>>>>>> >
>>>> >>>>>>>>>>>>>>>>>> > Regards,
>>>> >>>>>>>>>>>>>>>>>> > Gagan
>>>> >>>>>>>>>>>>>>>>>> >
>>>> >>>>>>>>>>>>>>>>>> > On Sat, Oct 20, 2012 at 9:50 PM, Garrett Barton <
>>>> >>>>>>>>>>>>>>>>>> [email protected]>wrote:
>>>> >>>>>>>>>>>>>>>>>> >
>>>> >>>>>>>>>>>>>>>>>> >> I guess it depends on if you're planning a 1.4
>>>> release with lucene 4. If
>>>> >>>>>>>>>>>>>>>>>> yes
>>>> >>>>>>>>>>>>>>>>>> >> then merge and work towards making everything
>>>> functional. If not then
>>>> >>>>>>>>>>>>>>>>>> leave
>>>> >>>>>>>>>>>>>>>>>> >> the 1.3.x in master for bug fixing or whatnot and
>>>> merge this branch into
>>>> >>>>>>>>>>>>>>>>>> >> the new api one.
>>>> >>>>>>>>>>>>>>>>>> >> On Oct 20, 2012 11:03 AM, "Aaron McCurry" <
>>>> [email protected]> wrote:
>>>> >>>>>>>>>>>>>>>>>> >>
>>>> >>>>>>>>>>>>>>>>>> >> > I think that we can merge the lucene-4.0.0 branch
>>>> back into the
>>>> >>>>>>>>>>>>>>>>>> >> > master, since tests and code are compiling.  I
>>>> haven't done any
>>>> >>>>>>>>>>>>>>>>>> >> > functional testing yet, but if much of the RPC
>>>> and internals are going
>>>> >>>>>>>>>>>>>>>>>> >> > to change I think that it may be a waste of time
>>>> to test and fix
>>>> >>>>>>>>>>>>>>>>>> >> > everything that we are about to change.  What do
>>>> others think?
>>>> >>>>>>>>>>>>>>>>>> >> >
>>>> >>>>>>>>>>>>>>>>>> >> > Aaron
>>>> >>>>>>>>>>>>>>>>>> >> >
>>>> >>>>>>>>>>>>>>>>>> >>
>>>> >>>>>>>>>>>>>>>>>>
>>>>
>>>
>>>
