[ 
https://issues.apache.org/jira/browse/MAHOUT-632?focusedWorklogId=992684&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-992684
 ]

ASF GitHub Bot logged work on MAHOUT-632:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 20/Nov/25 20:43
            Start Date: 20/Nov/25 20:43
    Worklog Time Spent: 10m 
      Work Description: rawkintrevo commented on PR #633:
URL: https://github.com/apache/mahout/pull/633#issuecomment-3559975857

   Thanks for the contribution @shiavm006 !!




Issue Time Tracking
-------------------

    Worklog Id:     (was: 992684)
    Time Spent: 20m  (was: 10m)

>  PFPGrowth : Exceeded max jobconf size
> --------------------------------------
>
>                 Key: MAHOUT-632
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-632
>             Project: Mahout
>          Issue Type: Bug
>    Affects Versions: 0.4, 0.5
>            Reporter: Vipul Pandey
>            Assignee: Robin Anil
>            Priority: Major
>             Fix For: 0.6
>
>         Attachments: MAHOUT-632.patch
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> I'm getting this error right after startParallelCounting finishes:
> 11/03/21 19:06:40 INFO mapred.JobClient:     Map output records=164272900
> 11/03/21 19:06:40 INFO mapred.JobClient:     SPLIT_RAW_BYTES=2860
> 11/03/21 19:06:40 INFO mapred.JobClient:     Reduce input records=67087840
> 11/03/21 19:07:02 INFO pfpgrowth.PFPGrowth: No of Features: 1788471
> 11/03/21 19:07:09 WARN mapred.JobClient: Use GenericOptionsParser for
> parsing the arguments. Applications should implement Tool for the same.
> 11/03/21 19:07:12 INFO input.FileInputFormat: Total input paths to process :
> 20
> 11/03/21 19:07:17 INFO mapred.JobClient: Cleaning up the staging area
> hdfs://nccc001:54310/mnt/analytics/data/hadoop/tmp/mapred/staging/isapps/.staging/job_201103101218_0287
> Exception in thread "main" org.apache.hadoop.ipc.RemoteException:
> java.io.IOException: java.io.IOException: Exceeded max jobconf size:
> 72276915 limit: 52428800
> at org.apache.hadoop.mapred.JobTracker.submitJob(JobTracker.java:3759)
> at sun.reflect.GeneratedMethodAccessor18.invoke(Unknown Source)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:557)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1416)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1412)
> Quoting Robin: "I guess we just hit the limit of storing flist in the conf. 
> Moving it to the distributed cache should fix this."
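The fix Robin describes — keeping the frequent-item list (fList) out of the serialized JobConf and shipping it as a side file through Hadoop's DistributedCache — can be sketched roughly as below. This is a hypothetical illustration, not the contents of MAHOUT-632.patch: the class `FListCacheSketch`, its method names, and the (feature, count) SequenceFile layout are all assumptions; only the `DistributedCache` and `SequenceFile` APIs (Hadoop 0.20-era) are real.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.filecache.DistributedCache;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

/**
 * Hypothetical sketch of the MAHOUT-632 fix: ship the fList to task nodes
 * via the DistributedCache instead of serializing it into the JobConf,
 * which is capped (here at 50 MB) by mapred.user.jobconf.limit.
 */
public final class FListCacheSketch {

  /** Driver side: write (feature, count) pairs to HDFS and register the file. */
  public static void cacheFList(Configuration conf, Path fListPath,
                                List<String> features, List<Long> counts)
      throws IOException {
    FileSystem fs = FileSystem.get(conf);
    SequenceFile.Writer writer =
        new SequenceFile.Writer(fs, conf, fListPath, Text.class, LongWritable.class);
    try {
      for (int i = 0; i < features.size(); i++) {
        writer.append(new Text(features.get(i)), new LongWritable(counts.get(i)));
      }
    } finally {
      writer.close();
    }
    // The file is localized on each task node; the JobConf only carries its URI.
    DistributedCache.addCacheFile(fListPath.toUri(), conf);
  }

  /** Task side (e.g. in Mapper.setup): read the cached fList back. */
  public static List<String> readFList(Configuration conf) throws IOException {
    List<String> features = new ArrayList<String>();
    Path[] cached = DistributedCache.getLocalCacheFiles(conf);
    if (cached == null || cached.length == 0) {
      return features;
    }
    FileSystem localFs = FileSystem.getLocal(conf);
    SequenceFile.Reader reader = new SequenceFile.Reader(localFs, cached[0], conf);
    try {
      Text key = new Text();
      LongWritable value = new LongWritable();
      while (reader.next(key, value)) {
        features.add(key.toString());
      }
    } finally {
      reader.close();
    }
    return features;
  }
}
```

With ~1.8M features, the serialized fList in this report reached ~72 MB, well past the 50 MB jobconf limit in the exception; a cached side file avoids that ceiling entirely, since DistributedCache files are only bounded by HDFS capacity.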



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
