Let me ask this another way (slightly tangential, but I believe related).
When I run BuildForest with a large split size on a large data set, I
get task timeouts unless I set mapred.task.timeout pretty high.  It
seems that I'm CPU bound while building the forest, but nothing is
updating the progress of the mapper.  Does this make sense?  In which
class does most of the CPU-intensive work happen when building a forest
with the partial implementation?  Everything I have read seems to
indicate that it is memory intensive, but I'm not seeing that at all.
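
For context, my rough understanding of the general Hadoop pattern (a
sketch only, not Mahout's actual mapper code) is that a long CPU-bound
map task has to call context.progress() periodically, otherwise the
TaskTracker kills it once mapred.task.timeout elapses, which would match
what I'm seeing:

  import java.io.IOException;

  import org.apache.hadoop.io.LongWritable;
  import org.apache.hadoop.io.NullWritable;
  import org.apache.hadoop.io.Text;
  import org.apache.hadoop.mapreduce.Mapper;

  // Illustrative only: a mapper whose work is CPU-bound and long-running.
  public class CpuBoundMapper
      extends Mapper<LongWritable, Text, NullWritable, NullWritable> {

    @Override
    protected void map(LongWritable key, Text value, Context context)
        throws IOException, InterruptedException {
      // stand-in loop for a long computation (e.g. building many trees)
      for (int i = 0; i < 100; i++) {
        expensiveStep(value);   // hypothetical placeholder for the real work
        context.progress();     // resets the task's timeout clock; without
                                // this the framework assumes the task hung
      }
    }

    private void expensiveStep(Text value) {
      // deliberately empty: placeholder only
    }
  }

If the partial builder spends most of its time inside one call without
reporting progress, that would explain why raising mapred.task.timeout
is the only thing that helps here.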

On Sat, Sep 8, 2012 at 1:54 PM, Nick Jordan <[email protected]> wrote:
> Actually, those are the commands that work.  If I run the first command
> with -Dmapred.max.split.size=21675370 (note that the split size is 10x
> larger), that is when I get the failure running the TestForest job.
>
>
>
> On Sat, Sep 8, 2012 at 1:53 PM, Nick Jordan <[email protected]> wrote:
>> /usr/local/hadoop/hadoop-1.0.3/bin/hadoop jar
>> /usr/local/mahout/examples/target/mahout-examples-0.8-SNAPSHOT-job.jar
>> org.apache.mahout.classifier.df.mapreduce.BuildForest
>> -Dmapred.max.split.size=2167537 -oob -d
>> /securityWaitTime/wt_top_airports_2007-2012_learn.data -ds
>> /securityWaitTime/wt_top_airports_2007-2012.info -sl 5 -p -t 100
>>
>> /usr/local/hadoop/hadoop-1.0.3/bin/hadoop jar
>> /usr/local/mahout/examples/target/mahout-examples-0.8-SNAPSHOT-job.jar
>> org.apache.mahout.classifier.df.mapreduce.TestForest -i
>> /securityWaitTime/wt_top_airports_2007-2012_test.data -ds
>> /securityWaitTime/wt_top_airports_2007-2012.info -m ob -a -mr -o
>> predictions
>>
>> On Sat, Sep 8, 2012 at 5:05 AM, deneche abdelhakim <[email protected]> 
>> wrote:
>>> Could you copy/paste the exact commands you used to run the training and
>>> the testing ?
>>>
>>> On Fri, Sep 7, 2012 at 11:10 PM, Nick Jordan <[email protected]> wrote:
>>>
>>>> Any thoughts here?
>>>>
>>>> On Thu, Sep 6, 2012 at 7:00 AM, Nick Jordan <[email protected]> wrote:
>>>> > Same problem with the sequential classifier.  My guess is that this
>>>> > "corruption" is happening because of that particular setting as it is
>>>> > the only thing that I'm changing, but I have no idea how to
>>>> > investigate further.
>>>> >
>>>> > Nick
>>>> >
>>>> > On Thu, Sep 6, 2012 at 2:22 AM, Abdelhakim Deneche <[email protected]>
>>>> wrote:
>>>> >> Hi Nick,
>>>> >>
>>>> >> This is not a memory problem: the classifier tries to load the trained
>>>> forest but gets some unexpected values. This problem never occurred
>>>> before! Could the forest files be corrupted?
>>>> >>
>>>> >> Try training the forest once again, and this time use the sequential
>>>> classifier (don't use the -mr parameter) and see if the problem still
>>>> occurs.
>>>> >>
>>>> >>
>>>> >> On 5 sept. 2012, at 23:00, Nick Jordan <[email protected]> wrote:
>>>> >>
>>>> >>> Hello All,
>>>> >>>
>>>> >>> I'm playing around with decision forests using the partial
>>>> >>> implementation and my own data set.  I am getting an error with
>>>> >>> TestForest, but only for certain forests that I'm building with
>>>> >>> BuildForest.  Using the same descriptor and same build and test data
>>>> >>> sets I get no error if I set mapred.max.split.size=1890528 which is
>>>> >>> roughly 1/100th the size of the build data set.  I can build the
>>>> >>> forest and test the remaining data and get the results with no
>>>> >>> problem.  When I change the split size to 18905280, everything still
>>>> >>> appears to work fine when building the forest, but when I then try to
>>>> >>> test the remaining data I get the error below.
>>>> >>>
>>>> >>> I've dug around the code a little, but nothing stood out as to why the
>>>> >>> array would go out of bounds at that specific value.  One solution is
>>>> >>> obviously not to create partitions that large, but if I were running
>>>> >>> out of memory I would have expected an out-of-memory error rather than
>>>> >>> an index past the bounds of an array.  I'd obviously prefer larger
>>>> >>> partitions, and thus fewer of them, and I can move this job to
>>>> >>> something like EMR, which should give me more memory, but I want to
>>>> >>> understand the nature of the error.
>>>> >>>
>>>> >>> For what it is worth I'm running this on hadoop-1.0.3 and
>>>> mahout-0.8-SNAPSHOT
>>>> >>>
>>>> >>> Thanks.
>>>> >>>
>>>> >>> --
>>>> >>>
>>>> >>> 12/09/05 17:52:09 INFO mapred.JobClient: Task Id :
>>>> >>> attempt_201209031756_0008_m_000000_0, Status : FAILED
>>>> >>> java.lang.ArrayIndexOutOfBoundsException: 946827879
>>>> >>>        at org.apache.mahout.classifier.df.node.Node.read(Node.java:58)
>>>> >>>        at
>>>> org.apache.mahout.classifier.df.DecisionForest.readFields(DecisionForest.java:197)
>>>> >>>        at
>>>> org.apache.mahout.classifier.df.DecisionForest.read(DecisionForest.java:203)
>>>> >>>        at
>>>> org.apache.mahout.classifier.df.DecisionForest.load(DecisionForest.java:225)
>>>> >>>        at
>>>> org.apache.mahout.classifier.df.mapreduce.Classifier$CMapper.setup(Classifier.java:212)
>>>> >>>        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
>>>> >>>        at
>>>> org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
>>>> >>>        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
>>>> >>>        at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>>>> >>>        at java.security.AccessController.doPrivileged(Native Method)
>>>> >>>        at javax.security.auth.Subject.doAs(Subject.java:416)
>>>> >>>        at
>>>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
>>>> >>>        at org.apache.hadoop.mapred.Child.main(Child.java:249)
>>>>
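
Looking at that stack trace again: Node.read blows up with an
ArrayIndexOutOfBoundsException of 946827879 while loading the forest,
which looks like a value read from the stream being used as an array
index.  A hedged sketch of that failure mode (illustrative only, not
Mahout's actual Node.read, and the Type values are made up):

  import java.io.DataInput;
  import java.io.IOException;

  public final class NodeReadSketch {

    // hypothetical node types, only to make the sketch self-contained
    enum Type { LEAF, NUMERICAL, CATEGORICAL }

    static Type readType(DataInput in) throws IOException {
      int tag = in.readInt();      // a corrupted or out-of-sync stream can
                                   // yield a huge value like 946827879 here
      return Type.values()[tag];   // -> ArrayIndexOutOfBoundsException
    }
  }

If that is roughly what happens, it would also explain why a corrupted
or out-of-sync forest file shows up as an index error rather than an
out-of-memory error.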
