Hi Eugene, select count(*) from mls_public_record_association_snapshot_orc pra left outer join mls_listing_snapshot_orc ml on pra.primary_listing_id = ml.id left outer join attribute a on a.id = ml.standard_status
ran till end and threw the below exception. MapReduce Total cumulative CPU time: 0 days 1 hours 0 minutes 53 seconds 760 msec Ended Job = job_1477494091659_0024 SLF4J: Class path contains multiple SLF4J bindings. SLF4J: Found binding in [jar:file:/opt/apache-hive-2.1.0-bin/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: Found binding in [jar:file:/opt/hadoop-2.7.2/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class] SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation. SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory] 2016-10-26 16:09:01 Starting to launch local task to process map join; maximum memory = 514850816 Execution failed with exit status: 2 Obtaining error information Task failed! Task ID: Stage-9 Logs: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask MapReduce Jobs Launched: Stage-Stage-1: Map: 300 Reduce: 121 Cumulative CPU: 3654.02 sec HDFS Read: 1771032233 HDFS Write: 1917532703 SUCCESS Total MapReduce CPU Time Spent: 0 days 1 hours 0 minutes 54 seconds 20 msec Explain Plan: STAGE DEPENDENCIES: Stage-8 is a root stage , consists of Stage-1 Stage-1 Stage-9 depends on stages: Stage-1 Stage-3 depends on stages: Stage-9 Stage-0 depends on stages: Stage-3 STAGE PLANS: Stage: Stage-8 Conditional Operator Stage: Stage-1 Map Reduce Map Operator Tree: TableScan alias: pra Statistics: Num rows: 99241216 Data size: 9924121600 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: primary_listing_id (type: string) outputColumnNames: _col0 Statistics: Num rows: 99241216 Data size: 9924121600 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator key expressions: _col0 (type: string) sort order: + Map-reduce partition columns: _col0 (type: string) Statistics: Num rows: 99241216 Data size: 9924121600 Basic stats: COMPLETE Column stats: NONE TableScan alias: ml Statistics: Num rows: 201432950 Data size: 20949026816 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: id (type: string), standard_status (type: int) outputColumnNames: _col0, _col1 Statistics: Num rows: 201432950 Data size: 20949026816 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator key expressions: _col0 (type: string) sort order: + Map-reduce partition columns: _col0 (type: string) Statistics: Num rows: 201432950 Data size: 20949026816 Basic stats: COMPLETE Column stats: NONE value expressions: _col1 (type: int) Reduce Operator Tree: Join Operator condition map: Left Outer Join0 to 1 keys: 0 _col0 (type: string) 1 _col0 (type: string) outputColumnNames: _col2 Statistics: Num rows: 221576249 Data size: 23043929997 Basic stats: COMPLETE Column stats: NONE File Output Operator compressed: false table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat serde: org.apache.hadoop.hive.serde2.lazybinary.LazyBinarySerDe Stage: Stage-9 --it is failing in the same mapreduce local work. Map Reduce Local Work Alias -> Map Local Tables: $hdt$_2:a Fetch Operator limit: -1 Alias -> Map Local Operator Tree: $hdt$_2:a TableScan alias: a Statistics: Num rows: 12830 Data size: 51322 Basic stats: COMPLETE Column stats: NONE Select Operator expressions: id (type: int) outputColumnNames: _col0 Statistics: Num rows: 12830 Data size: 51322 Basic stats: COMPLETE Column stats: NONE HashTable Sink Operator keys: 0 _col2 (type: int) 1 _col0 (type: int) Stage: Stage-3 Map Reduce Map Operator Tree: TableScan Map Join Operator condition map: Left Outer Join0 to 1 keys: 0 _col2 (type: int) 1 _col0 (type: int) Statistics: Num rows: 243733879 Data size: 25348323546 Basic stats: COMPLETE Column stats: NONE Group By Operator aggregations: count() mode: hash outputColumnNames: _col0 Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: NONE Reduce Output Operator sort order: Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: NONE value expressions: _col0 (type: bigint) Local Work: Map Reduce Local Work Reduce Operator Tree: Group By Operator aggregations: count(VALUE._col0) mode: mergepartial outputColumnNames: _col0 Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: NONE File Output Operator compressed: false Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: NONE table: input format: org.apache.hadoop.mapred.SequenceFileInputFormat output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe Stage: Stage-0 Fetch Operator limit: -1 Processor Tree: ListSink Any suggestion in debugging this issue is appreciated. Regards, Satyajit. On Wed, Oct 26, 2016 at 3:34 PM, Eugene Koifman <[email protected]> wrote: > If you can run this, then it’s safe to ignore “00094_flush_length” messages > and the issue is somewhere else > > select count(*) from mls_public_record_association_snapshot_orc pra left > outer join mls_listing_snapshot_orc ml on pra.primary_listing_id = ml.id left > outer join attribute a on a.id = ml.standard_status > > Eugene > > From: satyajit vegesna <[email protected]> > Date: Wednesday, October 26, 2016 at 2:14 PM > To: "[email protected]" <[email protected]>, Eugene Koifman < > [email protected]> > Cc: "[email protected]" <[email protected]> > Subject: Re: Error with flush_length File in Orc, in hive 2.1.0 and mr > execution engine. > > Hi Eugene, > > PFB Transaction table in green and parquet tables in yellow, > > INSERT INTO access_logs.crawlstats_dpp PARTITION(day="2016-10-23") select > pra.url as prUrl,pra.url_type as urlType,CAST(pra.created_at AS timestamp) > as prCreated, CAST(pra.updated_at AS timestamp) as prUpdated, > CAST(ml.created_at AS timestamp) as mlCreated, CAST(ml.updated_at AS > timestamp) as mlUpdated, a.name as status, pra.public_record_id as prId, > acl.accesstime as crawledon, pra.id as propId, pra.primary_listing_id as > listingId, datediff(CAST(acl.accesstime AS timestamp),CAST(ml.created_at AS > timestamp)) as mlcreateage, datediff(CAST(acl.accesstime AS > timestamp),CAST(ml.updated_at AS timestamp)) as mlupdateage, > datediff(CAST(acl.accesstime AS timestamp),CAST(pra.created_at AS > timestamp)) as prcreateage, datediff(CAST(acl.accesstime AS > timestamp),CAST(pra.updated_at AS timestamp)) as prupdateage, (case when > (pra.public_record_id is not null and TRIM(pra.public_record_id) <> '') > then (case when (pra.primary_listing_id is null or > TRIM(pra.primary_listing_id) = '') then 'PR' else 'PRMLS' END) else (case > when (pra.primary_listing_id is not null and TRIM(pra.primary_listing_id) > <> '') then 'MLS' else 'UNKNOWN' END) END) as listingType, > acl.httpstatuscode, acl.httpverb, acl.requesttime, > acl.upstreamheadertime , acl.upstreamresponsetime, acl.page_id, useragent > AS user_agent, substring(split(pra.url,'/')[0], > 0,length(split(pra.url,'/')[0])-3) as city, substring(split(pra.url,'/')[0], > length(split(pra.url,'/')[0])-1,2) as state, ml.mls_id FROM > access_logs.loadbalancer_accesslogs acl inner join > mls_public_record_association_snapshot_orc pra on acl.listing_url = > pra.url left outer join mls_listing_snapshot_orc ml on > pra.primary_listing_id = ml.id left outer join attribute a on a.id = > ml.standard_status WHERE acl.accesstimedate="2016-10-23"; > > > Any clue, or something that you would want me to focus on to debug the > issue. > > Regards, > Satyajit. > > > > On Tue, Oct 25, 2016 at 8:49 PM, Eugene Koifman <[email protected]> > wrote: > >> Which of your tables are are transactional? Can you provide the DDL? >> >> I don’t think “File does not exist” error is causing your queries to >> fail. It’s an INFO level msg. >> There should be some other error. >> >> Eugene >> >> >> From: satyajit vegesna <[email protected]> >> Reply-To: "[email protected]" <[email protected]> >> Date: Tuesday, October 25, 2016 at 5:46 PM >> To: "[email protected]" <[email protected]>, "[email protected]" >> <[email protected]> >> Subject: Error with flush_length File in Orc, in hive 2.1.0 and mr >> execution engine. >> >> HI All, >> >> i am using hive 2.1.0 , hadoop 2.7.2 , but when i try running queries >> like simple insert, >> >> set mapreduce.job.queuename=default;set hive.exec.dynamic.partition=true;set >> hive.exec.dynamic.partition.mode=nonstrict;set >> hive.exec.max.dynamic.partitions.pernode=400;set >> hive.exec.max.dynamic.partitions=2000;set mapreduce.map.memory.mb=5120;set >> mapreduce.reduce.memory.mb=5120;set >> mapred.tasktracker.map.tasks.maximum=30;set >> mapred.tasktracker.reduce.tasks.maximum=20;set >> mapred.reduce.child.java.opts=-Xmx2048m;set >> mapred.map.child.java.opts=-Xmx2048m; set hive.support.concurrency=true; >> set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager; set >> hive.compactor.initiator.on=false; set hive.compactor.worker.threads=1;set >> mapreduce.job.queuename=default;set hive.exec.dynamic.partition=true;set >> hive.exec.dynamic.partition.mode=nonstrict;INSERT INTO >> access_logs.crawlstats_dpp PARTITION(day="2016-10-23") select pra.url as >> prUrl,pra.url_type as urlType,CAST(pra.created_at AS timestamp) as >> prCreated, CAST(pra.updated_at AS timestamp) as prUpdated, >> CAST(ml.created_at AS timestamp) as mlCreated, CAST(ml.updated_at AS >> timestamp) as mlUpdated, a.name as status, pra.public_record_id as prId, >> acl.accesstime as crawledon, pra.id as propId, pra.primary_listing_id as >> listingId, datediff(CAST(acl.accesstime AS timestamp),CAST(ml.created_at AS >> timestamp)) as mlcreateage, datediff(CAST(acl.accesstime AS >> timestamp),CAST(ml.updated_at AS timestamp)) as mlupdateage, >> datediff(CAST(acl.accesstime AS timestamp),CAST(pra.created_at AS >> timestamp)) as prcreateage, datediff(CAST(acl.accesstime AS >> timestamp),CAST(pra.updated_at AS timestamp)) as prupdateage, (case when >> (pra.public_record_id is not null and TRIM(pra.public_record_id) <> '') >> then (case when (pra.primary_listing_id is null or >> TRIM(pra.primary_listing_id) = '') then 'PR' else 'PRMLS' END) else (case >> when (pra.primary_listing_id is not null and TRIM(pra.primary_listing_id) >> <> '') then 'MLS' else 'UNKNOWN' END) END) as listingType, >> acl.httpstatuscode, acl.httpverb, acl.requesttime, >> acl.upstreamheadertime , acl.upstreamresponsetime, acl.page_id, useragent >> AS user_agent, substring(split(pra.url,'/')[0], >> 0,length(split(pra.url,'/')[0])-3) as city, >> substring(split(pra.url,'/')[0], length(split(pra.url,'/')[0])-1,2) as >> state, ml.mls_id FROM access_logs.loadbalancer_accesslogs acl inner >> join mls_public_record_association_snapshot_orc pra on acl.listing_url = >> pra.url left outer join mls_listing_snapshot_orc ml on >> pra.primary_listing_id = ml.id left outer join attribute a on a.id = >> ml.standard_status WHERE acl.accesstimedate="2016-10-23"; >> >> i finally end up getting below error, >> >> 2016-10-25 17:40:18,725 Stage-2 map = 100%, reduce = 52%, Cumulative CPU >> 1478.96 sec >> 2016-10-25 17:40:19,761 Stage-2 map = 100%, reduce = 62%, Cumulative CPU >> 1636.58 sec >> 2016-10-25 17:40:20,794 Stage-2 map = 100%, reduce = 64%, Cumulative CPU >> 1764.97 sec >> 2016-10-25 17:40:21,820 Stage-2 map = 100%, reduce = 69%, Cumulative CPU >> 1879.61 sec >> 2016-10-25 17:40:22,842 Stage-2 map = 100%, reduce = 80%, Cumulative CPU >> 2051.38 sec >> 2016-10-25 17:40:23,872 Stage-2 map = 100%, reduce = 90%, Cumulative CPU >> 2151.49 sec >> 2016-10-25 17:40:24,907 Stage-2 map = 100%, reduce = 93%, Cumulative CPU >> 2179.67 sec >> 2016-10-25 17:40:25,944 Stage-2 map = 100%, reduce = 94%, Cumulative CPU >> 2187.86 sec >> 2016-10-25 17:40:29,062 Stage-2 map = 100%, reduce = 95%, Cumulative CPU >> 2205.22 sec >> 2016-10-25 17:40:30,107 Stage-2 map = 100%, reduce = 100%, Cumulative >> CPU 2241.25 sec >> MapReduce Total cumulative CPU time: 37 minutes 21 seconds 250 msec >> Ended Job = job_1477437520637_0009 >> SLF4J: Class path contains multiple SLF4J bindings. >> SLF4J: Found binding in [jar:file:/opt/apache-hive-2.1 >> .0-bin/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/Static >> LoggerBinder.class] >> SLF4J: Found binding in [jar:file:/opt/hadoop-2.7.2/sh >> are/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/im >> pl/StaticLoggerBinder.class] >> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an >> explanation. >> SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4 >> jLoggerFactory] >> 2016-10-25 17:40:35Starting to launch local task to process map join;maximum >> memory = 514850816 >> Execution failed with exit status: 2 >> Obtaining error information >> >> Task failed! >> Task ID: >> Stage-14 >> >> Logs: >> >> FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec >> .mr.MapredLocalTask >> MapReduce Jobs Launched: >> Stage-Stage-1: Map: 106 Reduce: 45 Cumulative CPU: 3390.11 sec HDFS >> Read: 8060555201 HDFS Write: 757253756 SUCCESS >> Stage-Stage-2: Map: 204 Reduce: 85 Cumulative CPU: 2241.25 sec HDFS >> Read: 2407914653 HDFS Write: 805874953 SUCCESS >> Total MapReduce CPU Time Spent: 0 days 1 hours 33 minutes 51 seconds 360 >> msec >> >> Could not find any errors in logs, but when i check namenode logs , oi >> get the following error, >> >> 2016-10-25 17:01:51,923 INFO org.apache.hadoop.ipc.Server: IPC Server >> handler 1 on 9000, call org.apache.hadoop.hdfs.protoco >> l.ClientProtocol.getBlockLocations from 192.168.120.133:47114 Call#4 >> Retry#0: java.io.FileNotFoundException: File does not exist: >> /user/hive/warehouse/mls_public_record_association_snapshot_ >> orc/delta_0000002_0000002_0000/bucket_00094_flush_length >> 2016-10-25 17:01:52,779 INFO org.apache.hadoop.ipc.Server: IPC Server >> handler 1 on 9000, call org.apache.hadoop.hdfs.protoco >> l.ClientProtocol.getBlockLocations from 192.168.120.132:43008 Call#4 >> Retry#0: java.io.FileNotFoundException: File does not exist: >> /user/hive/warehouse/mls_public_record_association_snapshot_ >> orc/delta_0000002_0000002_0000/bucket_00095_flush_length >> 2016-10-25 17:01:52,984 INFO org.apache.hadoop.ipc.Server: IPC Server >> handler 0 on 9000, call org.apache.hadoop.hdfs.protoco >> l.ClientProtocol.getBlockLocations from 192.168.120.133:47260 Call#4 >> Retry#0: java.io.FileNotFoundException: File does not exist: >> /user/hive/warehouse/mls_public_record_association_snapshot_ >> orc/delta_0000002_0000002_0000/bucket_00096_flush_length >> 2016-10-25 17:01:53,381 INFO org.apache.hadoop.ipc.Server: IPC Server >> handler 0 on 9000, call org.apache.hadoop.hdfs.protoco >> l.ClientProtocol.getBlockLocations from 192.168.120.132:43090 Call#4 >> Retry#0: java.io.FileNotFoundException: File does not exist: >> /user/hive/warehouse/mls_public_record_association_snapshot_ >> orc/delta_0000002_0000002_0000/bucket_00097_flush_length >> 2016-10-25 17:01:53,971 INFO org.apache.hadoop.ipc.Server: IPC Server >> handler 1 on 9000, call org.apache.hadoop.hdfs.protoco >> l.ClientProtocol.getBlockLocations from 192.168.120.134:37444 Call#4 >> Retry#0: java.io.FileNotFoundException: File does not exist: >> /user/hive/warehouse/mls_public_record_association_snapshot_ >> orc/delta_0000002_0000002_0000/bucket_00098_flush_length >> 2016-10-25 17:01:54,092 INFO org.apache.hadoop.ipc.Server: IPC Server >> handler 2 on 9000, call org.apache.hadoop.hdfs.protoco >> l.ClientProtocol.getBlockLocations from 192.168.120.133:47300 Call#4 >> Retry#0: java.io.FileNotFoundException: File does not exist: >> /user/hive/warehouse/mls_public_record_association_snapshot_ >> orc/delta_0000002_0000002_0000/bucket_00099_flush_length >> 2016-10-25 17:01:55,094 INFO org.apache.hadoop.ipc.Server: IPC Server >> handler 8 on 9000, call org.apache.hadoop.hdfs.protoco >> l.ClientProtocol.getBlockLocations from 192.168.120.134:37540 Call#4 >> Retry#0: java.io.FileNotFoundException: File does not exist: >> /user/hive/warehouse/mls_public_record_association_snapshot_ >> orc/delta_0000002_0000002_0000/bucket_00012_flush_length >> 2016-10-25 17:02:11,269 INFO org.apache.hadoop.ipc.Server: IPC Server >> handler 5 on 9000, call org.apache.hadoop.hdfs.protoco >> l.ClientProtocol.getBlockLocations from 192.168.120.133:47378 Call#4 >> Retry#0: java.io.FileNotFoundException: File does not exist: >> /user/hive/warehouse/mls_public_record_association_snapshot_ >> orc/delta_0000002_0000002_0000/bucket_00075_flush_length >> >> i also search for find the flush_length files in the above mentioned >> location, but i only see buckets but no files ending with flush_length. >> >> Any clue or help would be highly appreciated. >> >> Regards, >> Satyajit. >> >> >
