Also some lines on another node:

14/09/30 10:22:31 ERROR nio.NioBlockTransferService: Exception handling buffer message
java.io.IOException: Error in reading org.apache.spark.network.FileSegmentManagedBuffer(/mnt/DP_disk10/animal/spark/spark-local-20140930101701-c9ee/38/shuffle_6_162_0.data, 21118074, 544623) (actual file length 769648)
    at org.apache.spark.network.FileSegmentManagedBuffer.nioByteBuffer(ManagedBuffer.scala:80)
    at org.apache.spark.network.nio.NioBlockTransferService.getBlock(NioBlockTransferService.scala:203)
    at org.apache.spark.network.nio.NioBlockTransferService.org$apache$spark$network$nio$NioBlockTransferService$$processBlockMessage(NioBlockTransferService.scala:179)
    at org.apache.spark.network.nio.NioBlockTransferService$$anonfun$2.apply(NioBlockTransferService.scala:149)
    at org.apache.spark.network.nio.NioBlockTransferService$$anonfun$2.apply(NioBlockTransferService.scala:149)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
    at scala.collection.Iterator$class.foreach(Iterator.scala:727)
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
    at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
    at org.apache.spark.network.nio.BlockMessageArray.foreach(BlockMessageArray.scala:28)
    at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
    at org.apache.spark.network.nio.BlockMessageArray.map(BlockMessageArray.scala:28)
    at org.apache.spark.network.nio.NioBlockTransferService.org$apache$spark$network$nio$NioBlockTransferService$$onBlockMessageReceive(NioBlockTransferService.scala:149)
    at org.apache.spark.network.nio.NioBlockTransferService$$anonfun$init$1.apply(NioBlockTransferService.scala:68)
    at org.apache.spark.network.nio.NioBlockTransferService$$anonfun$init$1.apply(NioBlockTransferService.scala:68)
    at org.apache.spark.network.nio.ConnectionManager.org$apache$spark$network$nio$ConnectionManager$$handleMessage(ConnectionManager.scala:677)
    at org.apache.spark.network.nio.ConnectionManager$$anon$10.run(ConnectionManager.scala:515)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Channel not open for writing - cannot extend file to required size
    at sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:868)
    at org.apache.spark.network.FileSegmentManagedBuffer.nioByteBuffer(ManagedBuffer.scala:75)
    ... 20 more
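
The numbers in the message above can be checked mechanically. The following is a minimal sketch, assuming the three values printed after FileSegmentManagedBuffer( are the file path, the segment offset, and the segment length, as the field order in the message suggests. The function name check_segment and the regular expression are illustrative and not part of Spark. If the requested segment ends past the actual file length, a read-only memory map of that range cannot succeed, which would be consistent with the "cannot extend file to required size" cause shown in the trace.

```python
import re

# Matches the message text quoted in this thread:
# FileSegmentManagedBuffer(<path>, <offset>, <length>) (actual file length <n>)
# Assumption: the second and third fields are offset and length, in that order.
SEGMENT_RE = re.compile(
    r"FileSegmentManagedBuffer\((?P<path>[^,]+), (?P<offset>\d+), (?P<length>\d+)\)"
    r" \(actual file length (?P<file_length>\d+)\)"
)


def check_segment(message):
    """Return a summary of the requested segment, or None if nothing matches."""
    m = SEGMENT_RE.search(message)
    if m is None:
        return None
    offset = int(m.group("offset"))
    length = int(m.group("length"))
    file_length = int(m.group("file_length"))
    segment_end = offset + length
    return {
        "path": m.group("path"),
        "offset": offset,
        "length": length,
        "file_length": file_length,
        "segment_end": segment_end,
        # The segment is readable only if it ends at or before the end of the file.
        "in_bounds": segment_end <= file_length,
        "overshoot": max(0, segment_end - file_length),
    }


msg = (
    "Error in reading org.apache.spark.network.FileSegmentManagedBuffer("
    "/mnt/DP_disk10/animal/spark/spark-local-20140930101701-c9ee/38/shuffle_6_162_0.data, "
    "21118074, 544623) (actual file length 769648)"
)
result = check_segment(msg)
print(result["segment_end"], result["in_bounds"], result["overshoot"])
```

For this message the requested segment would end at byte 21662697 of a 769648-byte file, so even the starting offset lies far beyond the end of the file.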

From: Wang, Daoyuan
Sent: Tuesday, September 30, 2014 11:20 AM
To: Wang, Daoyuan; Reynold Xin
Cc: user@spark.apache.org
Subject: RE: SQL queries fail in 1.2.0-SNAPSHOT

And the /mnt/DP_disk2/animal/spark/spark-local-20140930102549-622d/11/shuffle_6_191_0.data file is much smaller than the other shuffle*.data files.

From: Wang, Daoyuan [mailto:daoyuan.w...@intel.com]
Sent: Tuesday, September 30, 2014 10:54 AM
To: Reynold Xin
Cc: user@spark.apache.org
Subject: RE: SQL queries fail in 1.2.0-SNAPSHOT

Hi Reynold,

It seems I am getting an offset much larger than the file size.

reading org.apache.spark.network.FileSegmentManagedBuffer(/mnt/DP_disk2/animal/spark/spark-local-20140930102549-622d/11/shuffle_6_191_0.data, 3154043, 588396) (actual file length 676025)
    at org.apache.spark.network.FileSegmentManagedBuffer.nioByteBuffer(ManagedBuffer.scala:80)
    at org.apache.spark.network.nio.NioBlockTransferService.getBlock(NioBlockTransferService.scala:203)
    at org.apache.spark.network.nio.NioBlockTransferService.org$apache$spark$network$nio$NioBlockTransferService$$processBlockMessage(NioBlockTransferService.scala:179)
    at org.apache.spark.network.nio.NioBlockTransferService$$anonfun$2.apply(NioBlockTransferService.scala:149)
    at org.apache.spark.network.nio.NioBlockTransferService$$anonfun$2.apply(NioBlockTransferService.scala:149)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
    at scala.collection.Iterator$class.foreach(Iterator.scala:727)
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)

java.io.IOException: Error in reading org.apache.spark.network.FileSegmentManagedBuffer(/mnt/DP_disk2/animal/spark/spark-local-20140930102549-622d/11/shuffle_6_191_0.data, 5618712, 616204) (actual file length 676025)
    at org.apache.spark.network.FileSegmentManagedBuffer.nioByteBuffer(ManagedBuffer.scala:80)
    at org.apache.spark.network.nio.NioBlockTransferService.getBlock(NioBlockTransferService.scala:203)
    at org.apache.spark.network.nio.NioBlockTransferService.org$apache$spark$network$nio$NioBlockTransferService$$processBlockMessage(NioBlockTransferService.scala:179)
    at org.apache.spark.network.nio.NioBlockTransferService$$anonfun$2.apply(NioBlockTransferService.scala:149)
    at org.apache.spark.network.nio.NioBlockTransferService$$anonfun$2.apply(NioBlockTransferService.scala:149)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
    at scala.collection.Iterator$class.foreach(Iterator.scala:727)
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)

These errors are intermittent: they occur in most runs but not every one, and they also depend on the query.

Thanks,
Daoyuan

From: Reynold Xin [mailto:r...@databricks.com]
Sent: Tuesday, September 30, 2014 3:48 AM
To: Wang, Daoyuan
Cc: user@spark.apache.org
Subject: Re: SQL queries fail in 1.2.0-SNAPSHOT

Hi Daoyuan,

Do you mind applying this patch and looking at the exception again?

https://github.com/apache/spark/pull/2580

It has also been merged into master, so if you pull from master, you should have it.

On Mon, Sep 29, 2014 at 1:17 AM, Wang, Daoyuan <daoyuan.w...@intel.com> wrote:

Hi all,

Some of my queries ran on 1.1.0-SNAPSHOT at commit b1b20301 (Aug 24), but on the current master branch they no longer work.
I looked into the stderr file on the executor and found the following lines:

14/09/26 16:52:46 ERROR nio.NioBlockTransferService: Exception handling buffer message
java.io.IOException: Channel not open for writing - cannot extend file to required size
    at sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:868)
    at org.apache.spark.network.FileSegmentManagedBuffer.nioByteBuffer(ManagedBuffer.scala:73)
    at org.apache.spark.network.nio.NioBlockTransferService.getBlock(NioBlockTransferService.scala:203)
    at org.apache.spark.network.nio.NioBlockTransferService.org$apache$spark$network$nio$NioBlockTransferService$$processBlockMessage(NioBlockTransferService.scala:179)
    at org.apache.spark.network.nio.NioBlockTransferService$$anonfun$2.apply(NioBlockTransferService.scala:149)
    at org.apache.spark.network.nio.NioBlockTransferService$$anonfun$2.apply(NioBlockTransferService.scala:149)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
    at scala.collection.Iterator$class.foreach(Iterator.scala:727)
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
    at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
    at org.apache.spark.network.nio.BlockMessageArray.foreach(BlockMessageArray.scala:28)
    at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
    at org.apache.spark.network.nio.BlockMessageArray.map(BlockMessageArray.scala:28)
    at org.apache.spark.network.nio.NioBlockTransferService.org$apache$spark$network$nio$NioBlockTransferService$$onBlockMessageReceive(NioBlockTransferService.scala:149)
    at org.apache.spark.network.nio.NioBlockTransferService$$anonfun$init$1.apply(NioBlockTransferService.scala:68)
    at org.apache.spark.network.nio.NioBlockTransferService$$anonfun$init$1.apply(NioBlockTransferService.scala:68)
    at org.apache.spark.network.nio.ConnectionManager.org$apache$spark$network$nio$ConnectionManager$$handleMessage(ConnectionManager.scala:677)
    at org.apache.spark.network.nio.ConnectionManager$$anon$10.run(ConnectionManager.scala:515)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

Shuffle compression was turned off because I encountered a parsing_error with it enabled. Even after I set the native library path, I got errors when decompressing with Snappy. With shuffle compression turned off, I still get the message above on some of my nodes, and the others report that an ack was not received after 60s.

Does anyone have any ideas? Thanks for your help!

Thanks,
Daoyuan Wang
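
For anyone trying to reproduce the setup described in the original message, the two settings it mentions could be written out as a spark-defaults.conf fragment. This is only a sketch under assumptions: spark.shuffle.compress is the property that controls shuffle compression, and spark.core.connection.ack.wait.timeout is assumed to be the property behind the 60s ack wait in the nio connection manager of that era. The value 120 is purely illustrative, and the helper render_spark_defaults is hypothetical, not something from Spark or from this thread.

```python
def render_spark_defaults(settings):
    """Render a dict of Spark properties as spark-defaults.conf lines.

    Each line is "<key> <value>"; booleans are written as true/false.
    """
    lines = []
    for key in sorted(settings):
        value = settings[key]
        if isinstance(value, bool):
            value = "true" if value else "false"
        lines.append(f"{key} {value}")
    return "\n".join(lines) + "\n"


settings = {
    # Turned off in the original report to work around the Snappy errors.
    "spark.shuffle.compress": False,
    # Assumed property for the ack wait; 120 is an illustrative value, not a recommendation.
    "spark.core.connection.ack.wait.timeout": 120,
}
conf_text = render_spark_defaults(settings)
print(conf_text)
```

Raising the ack wait would at most hide the timeout symptom; it does not address the out-of-range segment reads reported above.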