[ 
https://issues.apache.org/jira/browse/KYLIN-1650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15270854#comment-15270854
 ] 

Richard Calaba edited comment on KYLIN-1650 at 5/4/16 3:40 PM:
---------------------------------------------------------------

In my case - it is always 4 reducers failing but 2 suceeding, details below:

Task Type          Total          Complete
Map                    1                    1
Reduce               3                    2

Attempt Type          Failed          Killed          Successful
Maps                              0          0                    1
Reduces                         4          0                    2


was (Author: [email protected]):
In my case - it is always 4 reducers failing but 2 suceeding, details below:

Task Type       Total       Complete
Map                     1           1
Reduce          3           2

Attempt Type    Failed  Killed  Successful
Maps                    0             0 1
Reduces                 4             0 2

> java.io.IOException: Filesystem closed - in Cube Build Step 2 (MapR)
> --------------------------------------------------------------------
>
>                 Key: KYLIN-1650
>                 URL: https://issues.apache.org/jira/browse/KYLIN-1650
>             Project: Kylin
>          Issue Type: Bug
>    Affects Versions: v1.5.0, v1.5.1
>         Environment: MapR 4.1
>            Reporter: Richard Calaba
>            Assignee: Shaofeng SHI
>            Priority: Critical
>
> Hello,
> I am facing error in Build Cube - Step 2 - 4 Reducers are ending with 
> excpetion java.io.IOException: Filesystem closed .... I have searched a lot 
> and tried various versions (1.5.1, 1.5.0 - binary / compiled from sources 
> ...) and always with same result ... all on MapR 4.1 Hadoop
> The problem seems related also to data ... because exactly same cube defined 
> in different Hive schema (DB) - with exactly same metadata - worked fine. BUT 
> trying to the the fact table and lookup table join (one lookup table only) on 
> the Hive tables as defined in the failing Cube Model works fine !!!! So if it 
> is data issue -> then it is strange that select count(*) from fact left outer 
> lookup works without issues ... 
> Attaching syslog of one failed reducers ... in case anyone gets an idea what 
> could be the cause of this weird error ...
> Log Type: stderr
> Log Length: 507
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/data/mapr/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/tmp/hadoop-mapr/nm-local-dir/usercache/mapr/appcache/application_1461856462442_0134/filecache/10/job.jar/job.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> Log Type: stdout
> Log Length: 304
> spill values to disk...
> create file 
> /kylin/kylin_metadata/kylin-7c7f995d-4925-4fcc-b0bc-c4a97a547c2c/cube_xxxxx_model_150_2ndattempt/fact_distinct_columns/BILLDATE
> append file 
> /kylin/kylin_metadata/kylin-7c7f995d-4925-4fcc-b0bc-c4a97a547c2c/cube_xxxxx_model_150_2ndattempt/fact_distinct_columns/BILLDATE
> Log Type: syslog
> Log Length: 10167
> Showing 4096 bytes of 10167 total. Click here for the full log.
> es from disk
> 2016-05-03 18:38:11,275 INFO [main] 
> org.apache.hadoop.mapreduce.task.reduce.DirectShuffleMergeManagerImpl: 
> Merging 0 segments, 0 bytes from memory into reduce
> 2016-05-03 18:38:11,275 INFO [main] org.apache.hadoop.mapred.Merger: Merging 
> 1 sorted segments
> 2016-05-03 18:38:11,279 INFO [main] org.apache.hadoop.mapred.Merger: Down to 
> the last merge-pass, with 1 segments left of total size: 8605536 bytes
> 2016-05-03 18:38:11,322 INFO [main] 
> org.apache.hadoop.conf.Configuration.deprecation: mapred.skip.on is 
> deprecated. Instead, use mapreduce.job.skiprecords
> 2016-05-03 18:38:11,342 INFO [main] 
> org.apache.kylin.engine.mr.common.AbstractHadoopJob: The absolute path for 
> meta dir is 
> /tmp/hadoop-mapr/nm-local-dir/usercache/mapr/appcache/application_1461856462442_0134/container_1461856462442_0134_01_000003/meta
> 2016-05-03 18:38:11,347 INFO [main] org.apache.kylin.common.KylinConfig: Use 
> KYLIN_CONF=/tmp/hadoop-mapr/nm-local-dir/usercache/mapr/appcache/application_1461856462442_0134/container_1461856462442_0134_01_000003/meta
> 2016-05-03 18:38:11,352 INFO [main] org.apache.kylin.common.KylinConfigBase: 
> Kylin Config was updated with kylin.metadata.url : 
> /tmp/hadoop-mapr/nm-local-dir/usercache/mapr/appcache/application_1461856462442_0134/container_1461856462442_0134_01_000003/meta
> 2016-05-03 18:38:11,371 INFO [main] org.apache.kylin.cube.CubeManager: 
> Initializing CubeManager with config 
> /tmp/hadoop-mapr/nm-local-dir/usercache/mapr/appcache/application_1461856462442_0134/container_1461856462442_0134_01_000003/meta
> 2016-05-03 18:38:11,376 INFO [main] 
> org.apache.kylin.common.persistence.ResourceStore: Using metadata url 
> /tmp/hadoop-mapr/nm-local-dir/usercache/mapr/appcache/application_1461856462442_0134/container_1461856462442_0134_01_000003/meta
>  for resource store
> 2016-05-03 18:38:11,380 WARN [main] 
> org.apache.kylin.common.persistence.ResourceStore: Failed to load 
> ResourceStore impl class: java.lang.NoClassDefFoundError: 
> org/apache/hadoop/hbase/filter/Filter
> 2016-05-03 18:38:11,772 INFO [main] org.apache.kylin.cube.CubeDescManager: 
> Initializing CubeDescManager with config 
> /tmp/hadoop-mapr/nm-local-dir/usercache/mapr/appcache/application_1461856462442_0134/container_1461856462442_0134_01_000003/meta
> 2016-05-03 18:38:11,772 INFO [main] org.apache.kylin.cube.CubeDescManager: 
> Reloading Cube Metadata from folder 
> /tmp/hadoop-mapr/nm-local-dir/usercache/mapr/appcache/application_1461856462442_0134/container_1461856462442_0134_01_000003/meta/cube_desc
> 2016-05-03 18:38:14,530 WARN [main] org.apache.hadoop.mapred.YarnChild: 
> Exception running child : java.io.IOException: Filesystem closed
>       at com.mapr.fs.MapRFileSystem.checkOpen(MapRFileSystem.java:1469)
>       at com.mapr.fs.MapRFileSystem.lookupClient(MapRFileSystem.java:555)
>       at com.mapr.fs.MapRFileSystem.lookupClient(MapRFileSystem.java:649)
>       at com.mapr.fs.MapRFileSystem.delete(MapRFileSystem.java:1048)
>       at org.apache.hadoop.mapred.Merger$Segment.close(Merger.java:348)
>       at 
> org.apache.hadoop.mapred.Merger$MergeQueue.adjustPriorityQueue(Merger.java:521)
>       at org.apache.hadoop.mapred.Merger$MergeQueue.next(Merger.java:533)
>       at org.apache.hadoop.mapred.ReduceTask$4.next(ReduceTask.java:601)
>       at 
> org.apache.hadoop.mapreduce.task.ReduceContextImpl.nextKeyValue(ReduceContextImpl.java:155)
>       at 
> org.apache.hadoop.mapreduce.task.ReduceContextImpl.nextKey(ReduceContextImpl.java:121)
>       at 
> org.apache.hadoop.mapreduce.lib.reduce.WrappedReducer$Context.nextKey(WrappedReducer.java:302)
>       at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:170)
>       at 
> org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:627)
>       at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
>       at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:415)
>       at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1566)
>       at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
> 2016-05-03 18:38:14,536 INFO [main] org.apache.hadoop.mapred.Task: Runnning 
> cleanup for the task
> ALSO ADDING FULL SYSLOG:
> =========================
> Log Type: syslog
> Log Length: 10167
> 2016-05-03 18:38:05,984 WARN [main] org.apache.hadoop.conf.Configuration: 
> job.xml:an attempt to override final parameter: 
> mapreduce.job.end-notification.max.retry.interval;  Ignoring.
> 2016-05-03 18:38:05,986 WARN [main] org.apache.hadoop.conf.Configuration: 
> job.xml:an attempt to override final parameter: 
> mapreduce.job.end-notification.max.attempts;  Ignoring.
> 2016-05-03 18:38:06,213 INFO [main] 
> org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from 
> hadoop-metrics2.properties
> 2016-05-03 18:38:06,314 INFO [main] 
> org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period 
> at 10 second(s).
> 2016-05-03 18:38:06,314 INFO [main] 
> org.apache.hadoop.metrics2.impl.MetricsSystemImpl: ReduceTask metrics system 
> started
> 2016-05-03 18:38:06,435 INFO [main] org.apache.hadoop.mapred.YarnChild: 
> Executing with tokens:
> 2016-05-03 18:38:06,484 INFO [main] org.apache.hadoop.mapred.YarnChild: Kind: 
> mapreduce.job, Service: job_1461856462442_0134, Ident: 
> (org.apache.hadoop.mapreduce.security.token.JobTokenIdentifier@6bb31fae)
> 2016-05-03 18:38:06,536 INFO [main] org.apache.hadoop.mapred.YarnChild: 
> Sleeping for 0ms before retrying again. Got null now.
> 2016-05-03 18:38:06,842 INFO [main] org.apache.hadoop.mapred.YarnChild: 
> mapreduce.cluster.local.dir for child: 
> /tmp/hadoop-mapr/nm-local-dir/usercache/mapr/appcache/application_1461856462442_0134
> 2016-05-03 18:38:06,999 WARN [main] org.apache.hadoop.conf.Configuration: 
> job.xml:an attempt to override final parameter: 
> mapreduce.job.end-notification.max.retry.interval;  Ignoring.
> 2016-05-03 18:38:07,000 WARN [main] org.apache.hadoop.conf.Configuration: 
> job.xml:an attempt to override final parameter: 
> mapreduce.job.end-notification.max.attempts;  Ignoring.
> 2016-05-03 18:38:07,481 INFO [main] org.apache.hadoop.mapred.Task: 
> mapOutputFile class: org.apache.hadoop.mapred.MapRFsOutputFile
> 2016-05-03 18:38:07,481 INFO [main] 
> org.apache.hadoop.conf.Configuration.deprecation: session.id is deprecated. 
> Instead, use dfs.metrics.session-id
> 2016-05-03 18:38:07,521 INFO [main] org.apache.hadoop.mapred.Task:  Using 
> ResourceCalculatorProcessTree : [ ]
> 2016-05-03 18:38:07,603 INFO [main] org.apache.hadoop.mapred.ReduceTask: 
> Using ShuffleConsumerPlugin: 
> org.apache.hadoop.mapreduce.task.reduce.DirectShuffle@728cacdc
> 2016-05-03 18:38:07,632 INFO [main] 
> org.apache.hadoop.mapreduce.task.reduce.DirectShuffleMergeManagerImpl: 
> MergerManager: memoryLimit=1503238528, maxSingleShuffleLimit=375809632, 
> mergeThreshold=992137472, ioSortFactor=256, memToMemMergeOutputsThreshold=256
> 2016-05-03 18:38:07,638 INFO [EventFetcher for fetching Map Completion 
> Events] org.apache.hadoop.mapreduce.task.reduce.DirectShuffleEventFetcher: 
> attempt_1461856462442_0134_r_000000_0 Thread started: EventFetcher for 
> fetching Map Completion Events
> 2016-05-03 18:38:07,656 INFO [EventFetcher for fetching Map Completion 
> Events] org.apache.hadoop.mapreduce.task.reduce.DirectShuffleEventFetcher: 
> attempt_1461856462442_0134_r_000000_0: Got 1 new map-outputs
> 2016-05-03 18:38:07,685 INFO [main] org.apache.hadoop.io.compress.CodecPool: 
> Got brand-new decompressor [.snappy]
> 2016-05-03 18:38:07,692 INFO [main] org.apache.hadoop.io.compress.CodecPool: 
> Got brand-new decompressor [.snappy]
> 2016-05-03 18:38:07,693 INFO [main] org.apache.hadoop.io.compress.CodecPool: 
> Got brand-new decompressor [.snappy]
> 2016-05-03 18:38:07,695 INFO [main] org.apache.hadoop.io.compress.CodecPool: 
> Got brand-new decompressor [.snappy]
> 2016-05-03 18:38:07,696 INFO [main] org.apache.hadoop.io.compress.CodecPool: 
> Got brand-new decompressor [.snappy]
> 2016-05-03 18:38:07,697 INFO [main] org.apache.hadoop.io.compress.CodecPool: 
> Got brand-new decompressor [.snappy]
> 2016-05-03 18:38:07,698 INFO [main] org.apache.hadoop.io.compress.CodecPool: 
> Got brand-new decompressor [.snappy]
> 2016-05-03 18:38:07,699 INFO [main] org.apache.hadoop.io.compress.CodecPool: 
> Got brand-new decompressor [.snappy]
> 2016-05-03 18:38:07,700 INFO [main] org.apache.hadoop.io.compress.CodecPool: 
> Got brand-new decompressor [.snappy]
> 2016-05-03 18:38:07,702 INFO [main] org.apache.hadoop.io.compress.CodecPool: 
> Got brand-new decompressor [.snappy]
> 2016-05-03 18:38:07,703 INFO [main] org.apache.hadoop.io.compress.CodecPool: 
> Got brand-new decompressor [.snappy]
> 2016-05-03 18:38:07,705 INFO [main] org.apache.hadoop.io.compress.CodecPool: 
> Got brand-new decompressor [.snappy]
> 2016-05-03 18:38:07,767 INFO [MapOutputCopier 
> task_1461856462442_0134_r_000000.0] org.apache.hadoop.io.compress.CodecPool: 
> Got brand-new decompressor [.snappy]
> 2016-05-03 18:38:07,770 INFO [MapOutputCopier 
> task_1461856462442_0134_r_000000.0] 
> org.apache.hadoop.mapreduce.task.reduce.DirectShuffleFetcher: fetcher#0 about 
> to shuffle output of map attempt_1461856462442_0134_m_000000_0 to MEMORY
> 2016-05-03 18:38:08,046 INFO [MapOutputCopier 
> task_1461856462442_0134_r_000000.0] 
> org.apache.hadoop.mapreduce.task.reduce.DirectShuffleMergeManagerImpl: 
> closeInMemoryFile -> map-output of size: 52779014, inMemoryMapOutputs.size() 
> -> 1, commitMemory -> 0, usedMemory ->52779014
> 2016-05-03 18:38:08,047 INFO [EventFetcher for fetching Map Completion 
> Events] org.apache.hadoop.mapreduce.task.reduce.DirectShuffleEventFetcher: 
> EventFetcher is interrupted.. Returning
> 2016-05-03 18:38:08,141 INFO [main] 
> org.apache.hadoop.mapreduce.task.reduce.DirectShuffleMergeManagerImpl: 
> finalMerge called with 1 in-memory map-outputs and 0 on-disk map-outputs
> 2016-05-03 18:38:08,151 INFO [main] org.apache.hadoop.mapred.Merger: Merging 
> 1 sorted segments
> 2016-05-03 18:38:08,152 INFO [main] org.apache.hadoop.mapred.Merger: Down to 
> the last merge-pass, with 1 segments left of total size: 52778982 bytes
> 2016-05-03 18:38:08,160 INFO [main] org.apache.hadoop.io.compress.CodecPool: 
> Got brand-new compressor [.snappy]
> 2016-05-03 18:38:11,273 INFO [main] 
> org.apache.hadoop.mapreduce.task.reduce.DirectShuffleMergeManagerImpl: Merged 
> 1 segments, 52779014 bytes to disk to satisfy reduce memory limit
> 2016-05-03 18:38:11,274 INFO [main] 
> org.apache.hadoop.mapreduce.task.reduce.DirectShuffleMergeManagerImpl: 
> Merging 1 files, 8644203 bytes from disk
> 2016-05-03 18:38:11,275 INFO [main] 
> org.apache.hadoop.mapreduce.task.reduce.DirectShuffleMergeManagerImpl: 
> Merging 0 segments, 0 bytes from memory into reduce
> 2016-05-03 18:38:11,275 INFO [main] org.apache.hadoop.mapred.Merger: Merging 
> 1 sorted segments
> 2016-05-03 18:38:11,279 INFO [main] org.apache.hadoop.mapred.Merger: Down to 
> the last merge-pass, with 1 segments left of total size: 8605536 bytes
> 2016-05-03 18:38:11,322 INFO [main] 
> org.apache.hadoop.conf.Configuration.deprecation: mapred.skip.on is 
> deprecated. Instead, use mapreduce.job.skiprecords
> 2016-05-03 18:38:11,342 INFO [main] 
> org.apache.kylin.engine.mr.common.AbstractHadoopJob: The absolute path for 
> meta dir is 
> /tmp/hadoop-mapr/nm-local-dir/usercache/mapr/appcache/application_1461856462442_0134/container_1461856462442_0134_01_000003/meta
> 2016-05-03 18:38:11,347 INFO [main] org.apache.kylin.common.KylinConfig: Use 
> KYLIN_CONF=/tmp/hadoop-mapr/nm-local-dir/usercache/mapr/appcache/application_1461856462442_0134/container_1461856462442_0134_01_000003/meta
> 2016-05-03 18:38:11,352 INFO [main] org.apache.kylin.common.KylinConfigBase: 
> Kylin Config was updated with kylin.metadata.url : 
> /tmp/hadoop-mapr/nm-local-dir/usercache/mapr/appcache/application_1461856462442_0134/container_1461856462442_0134_01_000003/meta
> 2016-05-03 18:38:11,371 INFO [main] org.apache.kylin.cube.CubeManager: 
> Initializing CubeManager with config 
> /tmp/hadoop-mapr/nm-local-dir/usercache/mapr/appcache/application_1461856462442_0134/container_1461856462442_0134_01_000003/meta
> 2016-05-03 18:38:11,376 INFO [main] 
> org.apache.kylin.common.persistence.ResourceStore: Using metadata url 
> /tmp/hadoop-mapr/nm-local-dir/usercache/mapr/appcache/application_1461856462442_0134/container_1461856462442_0134_01_000003/meta
>  for resource store
> 2016-05-03 18:38:11,380 WARN [main] 
> org.apache.kylin.common.persistence.ResourceStore: Failed to load 
> ResourceStore impl class: java.lang.NoClassDefFoundError: 
> org/apache/hadoop/hbase/filter/Filter
> 2016-05-03 18:38:11,772 INFO [main] org.apache.kylin.cube.CubeDescManager: 
> Initializing CubeDescManager with config 
> /tmp/hadoop-mapr/nm-local-dir/usercache/mapr/appcache/application_1461856462442_0134/container_1461856462442_0134_01_000003/meta
> 2016-05-03 18:38:11,772 INFO [main] org.apache.kylin.cube.CubeDescManager: 
> Reloading Cube Metadata from folder 
> /tmp/hadoop-mapr/nm-local-dir/usercache/mapr/appcache/application_1461856462442_0134/container_1461856462442_0134_01_000003/meta/cube_desc
> 2016-05-03 18:38:14,530 WARN [main] org.apache.hadoop.mapred.YarnChild: 
> Exception running child : java.io.IOException: Filesystem closed
>       at com.mapr.fs.MapRFileSystem.checkOpen(MapRFileSystem.java:1469)
>       at com.mapr.fs.MapRFileSystem.lookupClient(MapRFileSystem.java:555)
>       at com.mapr.fs.MapRFileSystem.lookupClient(MapRFileSystem.java:649)
>       at com.mapr.fs.MapRFileSystem.delete(MapRFileSystem.java:1048)
>       at org.apache.hadoop.mapred.Merger$Segment.close(Merger.java:348)
>       at 
> org.apache.hadoop.mapred.Merger$MergeQueue.adjustPriorityQueue(Merger.java:521)
>       at org.apache.hadoop.mapred.Merger$MergeQueue.next(Merger.java:533)
>       at org.apache.hadoop.mapred.ReduceTask$4.next(ReduceTask.java:601)
>       at 
> org.apache.hadoop.mapreduce.task.ReduceContextImpl.nextKeyValue(ReduceContextImpl.java:155)
>       at 
> org.apache.hadoop.mapreduce.task.ReduceContextImpl.nextKey(ReduceContextImpl.java:121)
>       at 
> org.apache.hadoop.mapreduce.lib.reduce.WrappedReducer$Context.nextKey(WrappedReducer.java:302)
>       at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:170)
>       at 
> org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:627)
>       at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:389)
>       at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:415)
>       at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1566)
>       at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
> 2016-05-03 18:38:14,536 INFO [main] org.apache.hadoop.mapred.Task: Runnning 
> cleanup for the task



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to