Re: FP Growth saveAsTextFile

2015-05-20 Thread Xiangrui Meng
+user

If this is in cluster mode, you should provide a path on a shared file
system, e.g., HDFS, instead of a local path. If this is in local mode, I'm
not sure what went wrong.
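
To make the distinction concrete, here is a minimal sketch for a spark-shell
session. It assumes `model` is the FPGrowthModel from the quoted session
below; the namenode host, port, and directory names are hypothetical
placeholders, not values from this thread.

```scala
// Sketch only: `model` is assumed to be in scope in the spark-shell session.
// The host, port, and directories below are hypothetical placeholders.

// Cluster mode: each executor writes its own part file, so the target
// must be a location that every node can reach, such as HDFS.
val sharedPath = "hdfs://namenode:8020/user/eric/fpGrowth/modelText1"
model.freqItemsets.saveAsTextFile(sharedPath)

// Local mode: a path on the local file system, written with an explicit
// file: scheme so it is not resolved against the default file system.
val localPath = "file:///C:/tmp/fpGrowth/modelText1"
model.freqItemsets.saveAsTextFile(localPath)
```

Either call creates a directory of part files rather than a single file, and
it fails if the target directory already exists.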

On Wed, May 20, 2015 at 2:09 PM, Eric Tanner wrote:

> Here is the stack trace. Thanks for looking at this.
>
> scala> model.freqItemsets.saveAsTextFile("c:///repository/trunk/Scala_210_wspace/fpGrowth/modelText1")
> 15/05/20 14:07:47 INFO SparkContext: Starting job: saveAsTextFile at
> :33
> 15/05/20 14:07:47 INFO DAGScheduler: Got job 15 (saveAsTextFile at
> :33) with 2 output partitions (allowLocal=false)
> 15/05/20 14:07:47 INFO DAGScheduler: Final stage: Stage 30(saveAsTextFile
> at :33)
> 15/05/20 14:07:47 INFO DAGScheduler: Parents of final stage: List(Stage 29)
> 15/05/20 14:07:47 INFO DAGScheduler: Missing parents: List()
> 15/05/20 14:07:47 INFO DAGScheduler: Submitting Stage 30
> (MapPartitionsRDD[21] at saveAsTextFile at :33), which has no
> missing parents
> 15/05/20 14:07:47 INFO MemoryStore: ensureFreeSpace(131288) called with
> curMem=724428, maxMem=278302556
> 15/05/20 14:07:47 INFO MemoryStore: Block broadcast_18 stored as values in
> memory (estimated size 128.2 KB, free 264.6 MB)
> 15/05/20 14:07:47 INFO MemoryStore: ensureFreeSpace(78995) called with
> curMem=855716, maxMem=278302556
> 15/05/20 14:07:47 INFO MemoryStore: Block broadcast_18_piece0 stored as
> bytes in memory (estimated size 77.1 KB, free 264.5 MB)
> 15/05/20 14:07:47 INFO BlockManagerInfo: Added broadcast_18_piece0 in
> memory on localhost:52396 (size: 77.1 KB, free: 265.1 MB)
> 15/05/20 14:07:47 INFO BlockManagerMaster: Updated info of block
> broadcast_18_piece0
> 15/05/20 14:07:47 INFO SparkContext: Created broadcast 18 from broadcast
> at DAGScheduler.scala:839
> 15/05/20 14:07:47 INFO DAGScheduler: Submitting 2 missing tasks from Stage
> 30 (MapPartitionsRDD[21] at saveAsTextFile at :33)
> 15/05/20 14:07:47 INFO TaskSchedulerImpl: Adding task set 30.0 with 2 tasks
> 15/05/20 14:07:47 INFO BlockManager: Removing broadcast 17
> 15/05/20 14:07:47 INFO TaskSetManager: Starting task 0.0 in stage 30.0
> (TID 33, localhost, PROCESS_LOCAL, 1056 bytes)
> 15/05/20 14:07:47 INFO BlockManager: Removing block broadcast_17_piece0
> 15/05/20 14:07:47 INFO MemoryStore: Block broadcast_17_piece0 of size 4737
> dropped from memory (free 277372582)
> 15/05/20 14:07:47 INFO TaskSetManager: Starting task 1.0 in stage 30.0
> (TID 34, localhost, PROCESS_LOCAL, 1056 bytes)
> 15/05/20 14:07:47 INFO BlockManagerInfo: Removed broadcast_17_piece0 on
> localhost:52396 in memory (size: 4.6 KB, free: 265.1 MB)
> 15/05/20 14:07:47 INFO Executor: Running task 1.0 in stage 30.0 (TID 34)
> 15/05/20 14:07:47 INFO Executor: Running task 0.0 in stage 30.0 (TID 33)
> 15/05/20 14:07:47 INFO BlockManagerMaster: Updated info of block
> broadcast_17_piece0
> 15/05/20 14:07:47 INFO BlockManager: Removing block broadcast_17
> 15/05/20 14:07:47 INFO MemoryStore: Block broadcast_17 of size 6696
> dropped from memory (free 277379278)
> 15/05/20 14:07:47 INFO ContextCleaner: Cleaned broadcast 17
> 15/05/20 14:07:47 INFO BlockManager: Removing broadcast 16
> 15/05/20 14:07:47 INFO BlockManager: Removing block broadcast_16_piece0
> 15/05/20 14:07:47 INFO MemoryStore: Block broadcast_16_piece0 of size 4737
> dropped from memory (free 277384015)
> 15/05/20 14:07:47 INFO BlockManagerInfo: Removed broadcast_16_piece0 on
> localhost:52396 in memory (size: 4.6 KB, free: 265.1 MB)
> 15/05/20 14:07:47 INFO BlockManagerMaster: Updated info of block
> broadcast_16_piece0
> 15/05/20 14:07:47 INFO BlockManager: Removing block broadcast_16
> 15/05/20 14:07:47 INFO MemoryStore: Block broadcast_16 of size 6696
> dropped from memory (free 277390711)
> 15/05/20 14:07:47 INFO ContextCleaner: Cleaned broadcast 16
> 15/05/20 14:07:47 INFO ShuffleBlockFetcherIterator: Getting 2 non-empty
> blocks out of 2 blocks
> 15/05/20 14:07:47 INFO ShuffleBlockFetcherIterator: Started 0 remote
> fetches in 1 ms
> 15/05/20 14:07:47 INFO ShuffleBlockFetcherIterator: Getting 2 non-empty
> blocks out of 2 blocks
> 15/05/20 14:07:47 INFO ShuffleBlockFetcherIterator: Started 0 remote
> fetches in 0 ms
> 15/05/20 14:07:47 ERROR Executor: Exception in task 1.0 in stage 30.0 (TID
> 34)
> java.lang.NullPointerException
> at java.lang.ProcessBuilder.start(ProcessBuilder.java:1010)
> at org.apache.hadoop.util.Shell.runCommand(Shell.java:482)
> at org.apache.hadoop.util.Shell.run(Shell.java:455)
> at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:715)
> at org.apache.hadoop.util.Shell.execCommand(Shell.java:808)
> at org.apache.hadoop.util.Shell.execCommand(Shell.java:791)
> at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:656)
> at org.apache.hadoop.fs.FilterFileSystem.setPermission(FilterFileSystem.java:490)
> at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:462)
>  

Re: FP Growth saveAsTextFile

2015-05-20 Thread Xiangrui Meng
Could you post the stack trace? If you are using Spark 1.3 or 1.4, it
would be easier to save freq itemsets as a Parquet file. -Xiangrui
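
A rough sketch of the Parquet route, under several assumptions that are not
stated in this thread: a spark-shell session where `sqlContext` and `model`
are in scope, items of type String, and a Spark version in which
`freqItemsets` is an RDD of `FreqItemset` objects exposing `items` and `freq`.
The output path is a hypothetical placeholder.

```scala
// Sketch only: assumes `sqlContext` and `model` exist in the spark-shell
// session and that the items are Strings.
import sqlContext.implicits._

// Flatten each frequent itemset into an (items, freq) row.
val itemsetsDF = model.freqItemsets
  .map(itemset => (itemset.items.toSeq, itemset.freq))
  .toDF("items", "freq")

// Spark 1.4 writer API; the path is a hypothetical placeholder.
itemsetsDF.write.parquet("hdfs://namenode:8020/user/eric/fpGrowth/modelParquet")

// On Spark 1.3.x the equivalent call would be:
// itemsetsDF.saveAsParquetFile("hdfs://namenode:8020/user/eric/fpGrowth/modelParquet")
```

Parquet keeps the column names and types, so the itemsets can be loaded back
as a DataFrame without parsing text lines.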

On Wed, May 20, 2015 at 12:16 PM, Eric Tanner wrote:
> I am having trouble with saving an FP-Growth model as a text file.  I can
> print out the results, but when I try to save the model I get a
> NullPointerException.
>
> model.freqItemsets.saveAsTextFile("c://fpGrowth/model")
>
> Thanks,
>
> Eric
