Problem with HBase loader 0.90.3 and PIG 0.8.1
----------------------------------------------
Key: PIG-2193
URL: https://issues.apache.org/jira/browse/PIG-2193
Project: Pig
Issue Type: Bug
Affects Versions: 0.8.1
Environment: HBase 0.90.3, Hadoop 0.20-append
Reporter: Vincent BARAT
I've some data in HBase 0.90.3 and I run a simple script on them.
This script badly returns 0 records. From time to time, under yet undefined
conditions, the same script on the same data works (it return correct data).
When data are loaded from HDFS instead of HBase, the script runs perfectly.
Here is the script loading from HDFS (works):
start_sessions = LOAD 'start_sessions' AS (sid:chararray, infoid:chararray,
imei:chararray, start:long);
end_sessions = LOAD 'end_sessions' AS (sid:chararray, end:long,
locid:chararray);
infos = LOAD 'infos' AS (infoid:chararray, network_type:chararray,
network_subtype:chararray, locale:chararray, version_name:chararray,
carrier_country:chararray, carrier_name:chararray,
phone_manufacturer:chararray, phone_model:chararray,
firmware_version:chararray, firmware_name:chararray);
sessions = JOIN start_sessions BY sid, end_sessions BY sid;
sessions = FILTER sessions BY end > start AND end - start < 86400000L;
sessions = JOIN sessions BY infoid, infos BY infoid;
sessions = LIMIT sessions 100;
dump sessions;
The same script loading from HBase (don't work):
start_sessions = LOAD 'startSession' USING
org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid meta:infoid
meta:imei meta:timestamp') AS (sid:chararray, infoid:chararray, imei:chararray,
start:long);
end_sessions = LOAD 'endSession' USING
org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid meta:timestamp
meta:locid') AS (sid:chararray, end:long, locid:chararray);
infos = LOAD 'info' USING
org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:infoid data:networkType
data:networkSubtype data:locale data:applicationVersionName data:carrierCountry
data:carrierName data:phoneManufacturer data:phoneModel data:firmwareVersion
data:firmwareName') AS (infoid:chararray, network_type:chararray,
network_subtype:chararray, locale:chararray, version_name:chararray,
carrier_country:chararray, carrier_name:chararray,
phone_manufacturer:chararray, phone_model:chararray,
firmware_version:chararray, firmware_name:chararray);
sessions = JOIN start_sessions BY sid, end_sessions BY sid;
sessions = FILTER sessions BY end > start AND end - start < 86400000L;
sessions = JOIN sessions BY infoid, infos BY infoid;
sessions = LIMIT sessions 100;
dump sessions;
I guess it definitively means there is a nasty bug in the HBase loader.
Here is the PIG dump for the HBase version:
aws09:~# pig
2011-07-28 08:17:36,329 [main] INFO org.apache.pig.Main - Logging error
messages to: /root/pig_1311841056328.log
2011-07-28 08:17:36,641 [main] INFO
org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to
hadoop file system at: hdfs://aws09.preprod.ubithere.com:9000
2011-07-28 08:17:36,923 [main] INFO
org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to
map-reduce job tracker at: aws09.preprod.ubithere.com:9001
grunt> start_sessions = LOAD 'startSession.mde253811.preprod.ubithere.com'
USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid meta:infoid
meta:imei meta:timestamp') AS (sid:chararray, infoid:chararray, imei:chararray,
start:long);
grunt> end_sessions = LOAD 'endSession.mde253811.preprod.ubithere.com' USING
org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:sid meta:timestamp
meta:locid') AS (sid:chararray, end:long, locid:chararray);
grunt> infos = LOAD 'info.mde253811.preprod.ubithere.com' USING
org.apache.pig.backend.hadoop.hbase.HBaseStorage('meta:infoid data:networkType
data:networkSubtype data:locale data:applicationVersionName data:carrierCountry
data:carrierName data:phoneManufacturer data:phoneModel data:firmwareVersion
data:firmwareName') AS (infoid:chararray, network_type:chararray,
network_subtype:chararray, locale:chararray, version_name:chararray,
carrier_country:chararray, carrier_name:chararray,
phone_manufacturer:chararray, phone_model:chararray,
firmware_version:chararray, firmware_name:chararray);
grunt> sessions = JOIN start_sessions BY sid, end_sessions BY sid;
grunt> sessions = FILTER sessions BY end > start AND end - start < 86400000L;
grunt> sessions = JOIN sessions BY infoid, infos BY infoid;
grunt> sessions = LIMIT sessions 100;
grunt> dump sessions;
2011-07-28 08:17:50,275 [main] INFO org.apache.pig.tools.pigstats.ScriptState
- Pig features used in the script: HASH_JOIN,FILTER,LIMIT
2011-07-28 08:17:50,275 [main] INFO
org.apache.pig.backend.hadoop.executionengine.HExecutionEngine -
pig.usenewlogicalplan is set to true. New logical plan will be used.
2011-07-28 08:17:51,213 [main] INFO
org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - (Name:
sessions:
Store(hdfs://aws09.preprod.ubithere.com:9000/tmp/temp-1404953096/tmp819396740:org.apache.pig.impl.io.InterStorage)
- scope-93 Operator Key: scope-93)
2011-07-28 08:17:51,225 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File
concatenation threshold: 100 optimistic? false
2011-07-28 08:17:51,281 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler$LastInputStreamingOptimizer
- Rewrite: POPackage->POForEach to POJoinPackage
2011-07-28 08:17:51,281 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler$LastInputStreamingOptimizer
- Rewrite: POPackage->POForEach to POJoinPackage
2011-07-28 08:17:51,350 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer
- MR plan size before optimization: 3
2011-07-28 08:17:51,350 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer
- MR plan size after optimization: 3
2011-07-28 08:17:51,402 [main] INFO org.apache.pig.tools.pigstats.ScriptState
- Pig script settings are added to the job
2011-07-28 08:17:51,411 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler
- mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2011-07-28 08:17:51,470 [main] INFO org.apache.zookeeper.ZooKeeper - Client
environment:zookeeper.version=3.3.2-1031432, built on 11/05/2010 05:32 GMT
2011-07-28 08:17:51,470 [main] INFO org.apache.zookeeper.ZooKeeper - Client
environment:host.name=aws09.machine.com
2011-07-28 08:17:51,470 [main] INFO org.apache.zookeeper.ZooKeeper - Client
environment:java.version=1.6.0_22
2011-07-28 08:17:51,470 [main] INFO org.apache.zookeeper.ZooKeeper - Client
environment:java.vendor=Sun Microsystems Inc.
2011-07-28 08:17:51,470 [main] INFO org.apache.zookeeper.ZooKeeper - Client
environment:java.home=/usr/lib/jvm/java-6-sun-1.6.0.22/jre
2011-07-28 08:17:51,470 [main] INFO org.apache.zookeeper.ZooKeeper - Client
environment:java.class.path=/opt/pig/bin/../conf:/usr/lib/jvm/java-6-sun/jre/lib/tools.jar:/opt/pig/bin/../pig-0.8.1-core.jar:/opt/pig/bin/../build/pig-*-SNAPSHOT.jar:/opt/pig/bin/../lib/commons-el-1.0.jar:/opt/pig/bin/../lib/commons-lang-2.4.jar:/opt/pig/bin/../lib/commons-logging-1.1.1.jar:/opt/pig/bin/../lib/guava-r06.jar:/opt/pig/bin/../lib/hbase-0.90.3.jar:/opt/pig/bin/../lib/hsqldb-1.8.0.10.jar:/opt/pig/bin/../lib/jackson-core-asl-1.0.1.jar:/opt/pig/bin/../lib/jackson-mapper-asl-1.0.1.jar:/opt/pig/bin/../lib/javacc-4.2.jar:/opt/pig/bin/../lib/javacc.jar:/opt/pig/bin/../lib/jetty-util-6.1.14.jar:/opt/pig/bin/../lib/jline-0.9.94.jar:/opt/pig/bin/../lib/joda-time-1.6.jar:/opt/pig/bin/../lib/jsch-0.1.38.jar:/opt/pig/bin/../lib/junit-4.5.jar:/opt/pig/bin/../lib/jython-2.5.0.jar:/opt/pig/bin/../lib/log4j-1.2.14.jar:/opt/pig/bin/../lib/pigudfs.jar:/opt/pig/bin/../lib/slf4j-log4j12-1.4.3.jar:/opt/pig/bin/../lib/zookeeper-3.3.2.jar:/opt/hadoop/conf_computation:/opt/hbase/conf:/opt/pig/lib/hadoop-0.20-append-core.jar
2011-07-28 08:17:51,470 [main] INFO org.apache.zookeeper.ZooKeeper - Client
environment:java.library.path=/usr/lib/jvm/java-6-sun-1.6.0.22/jre/lib/amd64/server:/usr/lib/jvm/java-6-sun-1.6.0.22/jre/lib/amd64:/usr/lib/jvm/java-6-sun-1.6.0.22/jre/../lib/amd64:/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
2011-07-28 08:17:51,470 [main] INFO org.apache.zookeeper.ZooKeeper - Client
environment:java.io.tmpdir=/tmp
2011-07-28 08:17:51,470 [main] INFO org.apache.zookeeper.ZooKeeper - Client
environment:java.compiler=<NA>
2011-07-28 08:17:51,470 [main] INFO org.apache.zookeeper.ZooKeeper - Client
environment:os.name=Linux
2011-07-28 08:17:51,470 [main] INFO org.apache.zookeeper.ZooKeeper - Client
environment:os.arch=amd64
2011-07-28 08:17:51,470 [main] INFO org.apache.zookeeper.ZooKeeper - Client
environment:os.version=2.6.21.7-2.fc8xen-ec2-v1.0
2011-07-28 08:17:51,470 [main] INFO org.apache.zookeeper.ZooKeeper - Client
environment:user.name=root
2011-07-28 08:17:51,470 [main] INFO org.apache.zookeeper.ZooKeeper - Client
environment:user.home=/root
2011-07-28 08:17:51,470 [main] INFO org.apache.zookeeper.ZooKeeper - Client
environment:user.dir=/root
2011-07-28 08:17:51,471 [main] INFO org.apache.zookeeper.ZooKeeper -
Initiating client connection, connectString=aws09.machine.com:2222
sessionTimeout=60000 watcher=hconnection
2011-07-28 08:17:51,493 [main-SendThread()] INFO
org.apache.zookeeper.ClientCnxn - Opening socket connection to server
aws09.machine.com/10.83.1.244:2222
2011-07-28 08:17:51,499 [main-SendThread(aws09.machine.com:2222)] INFO
org.apache.zookeeper.ClientCnxn - Socket connection established to
aws09.machine.com/10.83.1.244:2222, initiating session
2011-07-28 08:17:51,508 [main-SendThread(aws09.machine.com:2222)] INFO
org.apache.zookeeper.ClientCnxn - Session establishment complete on server
aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada6054b, negotiated
timeout = 60000
2011-07-28 08:17:51,575 [main] DEBUG
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation -
Lookedup root region location,
connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@ef894ce;
hsa=aws03.machine.com:60020
2011-07-28 08:17:51,687 [main] DEBUG
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation -
Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
2011-07-28 08:17:51,696 [main] DEBUG org.apache.hadoop.hbase.client.MetaScanner
- Scanning .META. starting at
row=endSession.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
2011-07-28 08:17:51,700 [main] DEBUG
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation -
Cached location for
endSession.mde253811.preprod.ubithere.com,,1311086199483.706685579 is
aws03.machine.com:60020
2011-07-28 08:17:51,726 [main] DEBUG org.apache.hadoop.hbase.client.MetaScanner
- Scanning .META. starting at
row=startSession.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
2011-07-28 08:17:51,729 [main] DEBUG
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation -
Cached location for
startSession.mde253811.preprod.ubithere.com,,1311086198252.1334391323 is
aws03.machine.com:60020
2011-07-28 08:17:53,328 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler
- Setting up single store job
2011-07-28 08:17:53,335 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler
- BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=0
2011-07-28 08:17:53,335 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler
- Neither PARALLEL nor default parallelism is set for this job. Setting number
of reducers to 1
2011-07-28 08:17:53,442 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- 1 map-reduce job(s) waiting for submission.
2011-07-28 08:17:53,944 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- 0% complete
2011-07-28 08:17:53,989 [Thread-13] INFO org.apache.zookeeper.ZooKeeper -
Initiating client connection, connectString=aws09.machine.com:2222
sessionTimeout=60000 watcher=hconnection
2011-07-28 08:17:53,990 [Thread-13-SendThread()] INFO
org.apache.zookeeper.ClientCnxn - Opening socket connection to server
aws09.machine.com/10.83.1.244:2222
2011-07-28 08:17:53,991 [Thread-13-SendThread(aws09.machine.com:2222)] INFO
org.apache.zookeeper.ClientCnxn - Socket connection established to
aws09.machine.com/10.83.1.244:2222, initiating session
2011-07-28 08:17:53,996 [Thread-13-SendThread(aws09.machine.com:2222)] INFO
org.apache.zookeeper.ClientCnxn - Session establishment complete on server
aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada6054c, negotiated
timeout = 60000
2011-07-28 08:17:54,000 [Thread-13] DEBUG
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation -
Lookedup root region location,
connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@2d9f90e3;
hsa=aws03.machine.com:60020
2011-07-28 08:17:54,005 [Thread-13] DEBUG
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation -
Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
2011-07-28 08:17:54,006 [Thread-13] DEBUG
org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at
row=endSession.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
2011-07-28 08:17:54,011 [Thread-13] DEBUG
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation -
Cached location for
endSession.mde253811.preprod.ubithere.com,,1311086199483.706685579 is
aws03.machine.com:60020
2011-07-28 08:17:54,017 [Thread-13] INFO org.apache.zookeeper.ZooKeeper -
Initiating client connection, connectString=aws09.machine.com:2222
sessionTimeout=60000 watcher=hconnection
2011-07-28 08:17:54,017 [Thread-13-SendThread()] INFO
org.apache.zookeeper.ClientCnxn - Opening socket connection to server
aws09.machine.com/10.83.1.244:2222
2011-07-28 08:17:54,018 [Thread-13-SendThread(aws09.machine.com:2222)] INFO
org.apache.zookeeper.ClientCnxn - Socket connection established to
aws09.machine.com/10.83.1.244:2222, initiating session
2011-07-28 08:17:54,025 [Thread-13-SendThread(aws09.machine.com:2222)] INFO
org.apache.zookeeper.ClientCnxn - Session establishment complete on server
aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada6054d, negotiated
timeout = 60000
2011-07-28 08:17:54,029 [Thread-13] DEBUG
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation -
Lookedup root region location,
connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@41f6321;
hsa=aws03.machine.com:60020
2011-07-28 08:17:54,032 [Thread-13] DEBUG
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation -
Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
2011-07-28 08:17:54,033 [Thread-13] DEBUG
org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at
row=endSession.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
2011-07-28 08:17:54,037 [Thread-13] DEBUG
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation -
Cached location for
endSession.mde253811.preprod.ubithere.com,,1311086199483.706685579 is
aws03.machine.com:60020
2011-07-28 08:17:54,039 [Thread-13] DEBUG
org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at
row=endSession.mde253811.preprod.ubithere.com,,00000000000000 for
max=2147483647 rows
2011-07-28 08:17:54,067 [Thread-13] DEBUG
org.apache.hadoop.hbase.mapreduce.TableInputFormatBase - getSplits: split -> 0
-> aws03.machine.com:,
2011-07-28 08:17:54,068 [Thread-13] INFO
org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat - Got 1 splits.
2011-07-28 08:17:54,068 [Thread-13] INFO
org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat - Returning 1 splits.
2011-07-28 08:17:54,109 [Thread-13] INFO org.apache.zookeeper.ZooKeeper -
Initiating client connection, connectString=aws09.machine.com:2222
sessionTimeout=60000 watcher=hconnection
2011-07-28 08:17:54,110 [Thread-13-SendThread()] INFO
org.apache.zookeeper.ClientCnxn - Opening socket connection to server
aws09.machine.com/10.83.1.244:2222
2011-07-28 08:17:54,111 [Thread-13-SendThread(aws09.machine.com:2222)] INFO
org.apache.zookeeper.ClientCnxn - Socket connection established to
aws09.machine.com/10.83.1.244:2222, initiating session
2011-07-28 08:17:54,119 [Thread-13-SendThread(aws09.machine.com:2222)] INFO
org.apache.zookeeper.ClientCnxn - Session establishment complete on server
aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada6054e, negotiated
timeout = 60000
2011-07-28 08:17:54,123 [Thread-13] DEBUG
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation -
Lookedup root region location,
connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@20c3e967;
hsa=aws03.machine.com:60020
2011-07-28 08:17:54,140 [Thread-13] DEBUG
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation -
Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
2011-07-28 08:17:54,142 [Thread-13] DEBUG
org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at
row=startSession.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
2011-07-28 08:17:54,148 [Thread-13] DEBUG
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation -
Cached location for
startSession.mde253811.preprod.ubithere.com,,1311086198252.1334391323 is
aws03.machine.com:60020
2011-07-28 08:17:54,154 [Thread-13] INFO org.apache.zookeeper.ZooKeeper -
Initiating client connection, connectString=aws09.machine.com:2222
sessionTimeout=60000 watcher=hconnection
2011-07-28 08:17:54,158 [Thread-13-SendThread()] INFO
org.apache.zookeeper.ClientCnxn - Opening socket connection to server
aws09.machine.com/10.83.1.244:2222
2011-07-28 08:17:54,159 [Thread-13-SendThread(aws09.machine.com:2222)] INFO
org.apache.zookeeper.ClientCnxn - Socket connection established to
aws09.machine.com/10.83.1.244:2222, initiating session
2011-07-28 08:17:54,161 [Thread-13-SendThread(aws09.machine.com:2222)] INFO
org.apache.zookeeper.ClientCnxn - Session establishment complete on server
aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada6054f, negotiated
timeout = 60000
2011-07-28 08:17:54,164 [Thread-13] DEBUG
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation -
Lookedup root region location,
connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@5ee771f3;
hsa=aws03.machine.com:60020
2011-07-28 08:17:54,167 [Thread-13] DEBUG
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation -
Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
2011-07-28 08:17:54,169 [Thread-13] DEBUG
org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at
row=startSession.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
2011-07-28 08:17:54,172 [Thread-13] DEBUG
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation -
Cached location for
startSession.mde253811.preprod.ubithere.com,,1311086198252.1334391323 is
aws03.machine.com:60020
2011-07-28 08:17:54,173 [Thread-13] DEBUG
org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at
row=startSession.mde253811.preprod.ubithere.com,,00000000000000 for
max=2147483647 rows
2011-07-28 08:17:54,180 [Thread-13] DEBUG
org.apache.hadoop.hbase.mapreduce.TableInputFormatBase - getSplits: split -> 0
-> aws03.machine.com:,
2011-07-28 08:17:54,180 [Thread-13] INFO
org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat - Got 1 splits.
2011-07-28 08:17:54,180 [Thread-13] INFO
org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat - Returning 1 splits.
2011-07-28 08:17:55,037 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- HadoopJobId: job_201107251336_0314
2011-07-28 08:17:55,037 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- More information at:
http://aws09.preprod.ubithere.com:50030/jobdetails.jsp?jobid=job_201107251336_0314
2011-07-28 08:19:06,924 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- 8% complete
2011-07-28 08:19:15,971 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- 16% complete
2011-07-28 08:19:18,985 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- 19% complete
2011-07-28 08:19:25,035 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- 27% complete
2011-07-28 08:20:14,810 [main] INFO org.apache.pig.tools.pigstats.ScriptState
- Pig script settings are added to the job
2011-07-28 08:20:14,812 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler
- mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2011-07-28 08:20:14,830 [main] INFO org.apache.zookeeper.ZooKeeper -
Initiating client connection, connectString=aws09.machine.com:2222
sessionTimeout=60000 watcher=hconnection
2011-07-28 08:20:14,831 [main-SendThread()] INFO
org.apache.zookeeper.ClientCnxn - Opening socket connection to server
aws09.machine.com/10.83.1.244:2222
2011-07-28 08:20:14,832 [main-SendThread(aws09.machine.com:2222)] INFO
org.apache.zookeeper.ClientCnxn - Socket connection established to
aws09.machine.com/10.83.1.244:2222, initiating session
2011-07-28 08:20:14,838 [main-SendThread(aws09.machine.com:2222)] INFO
org.apache.zookeeper.ClientCnxn - Session establishment complete on server
aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada60556, negotiated
timeout = 60000
2011-07-28 08:20:14,842 [main] DEBUG
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation -
Lookedup root region location,
connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@5d4fa79d;
hsa=aws03.machine.com:60020
2011-07-28 08:20:14,847 [main] DEBUG
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation -
Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
2011-07-28 08:20:14,849 [main] DEBUG org.apache.hadoop.hbase.client.MetaScanner
- Scanning .META. starting at
row=info.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
2011-07-28 08:20:14,852 [main] DEBUG
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation -
Cached location for
info.mde253811.preprod.ubithere.com,,1311086202955.1975990008 is
aws03.machine.com:60020
2011-07-28 08:20:16,311 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler
- Setting up single store job
2011-07-28 08:20:16,324 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler
- BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=198330658
2011-07-28 08:20:16,324 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler
- Neither PARALLEL nor default parallelism is set for this job. Setting number
of reducers to 1
2011-07-28 08:20:16,341 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- 1 map-reduce job(s) waiting for submission.
2011-07-28 08:20:16,656 [Thread-32] INFO
org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to
process : 1
2011-07-28 08:20:16,656 [Thread-32] INFO
org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input
paths to process : 1
2011-07-28 08:20:16,693 [Thread-32] INFO org.apache.zookeeper.ZooKeeper -
Initiating client connection, connectString=aws09.machine.com:2222
sessionTimeout=60000 watcher=hconnection
2011-07-28 08:20:16,694 [Thread-32-SendThread()] INFO
org.apache.zookeeper.ClientCnxn - Opening socket connection to server
aws09.machine.com/10.83.1.244:2222
2011-07-28 08:20:16,695 [Thread-32-SendThread(aws09.machine.com:2222)] INFO
org.apache.zookeeper.ClientCnxn - Socket connection established to
aws09.machine.com/10.83.1.244:2222, initiating session
2011-07-28 08:20:16,702 [Thread-32-SendThread(aws09.machine.com:2222)] INFO
org.apache.zookeeper.ClientCnxn - Session establishment complete on server
aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada60557, negotiated
timeout = 60000
2011-07-28 08:20:16,705 [Thread-32] DEBUG
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation -
Lookedup root region location,
connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@37285252;
hsa=aws03.machine.com:60020
2011-07-28 08:20:16,709 [Thread-32] DEBUG
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation -
Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
2011-07-28 08:20:16,710 [Thread-32] DEBUG
org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at
row=info.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
2011-07-28 08:20:16,714 [Thread-32] DEBUG
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation -
Cached location for
info.mde253811.preprod.ubithere.com,,1311086202955.1975990008 is
aws03.machine.com:60020
2011-07-28 08:20:16,716 [Thread-32] INFO org.apache.zookeeper.ZooKeeper -
Initiating client connection, connectString=aws09.machine.com:2222
sessionTimeout=60000 watcher=hconnection
2011-07-28 08:20:16,717 [Thread-32-SendThread()] INFO
org.apache.zookeeper.ClientCnxn - Opening socket connection to server
aws09.machine.com/10.83.1.244:2222
2011-07-28 08:20:16,718 [Thread-32-SendThread(aws09.machine.com:2222)] INFO
org.apache.zookeeper.ClientCnxn - Socket connection established to
aws09.machine.com/10.83.1.244:2222, initiating session
2011-07-28 08:20:16,720 [Thread-32-SendThread(aws09.machine.com:2222)] INFO
org.apache.zookeeper.ClientCnxn - Session establishment complete on server
aws09.machine.com/10.83.1.244:2222, sessionid = 0x131617dada60558, negotiated
timeout = 60000
2011-07-28 08:20:16,723 [Thread-32] DEBUG
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation -
Lookedup root region location,
connection=org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation@7418e252;
hsa=aws03.machine.com:60020
2011-07-28 08:20:16,726 [Thread-32] DEBUG
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation -
Cached location for .META.,,1.1028785192 is aws03.machine.com:60020
2011-07-28 08:20:16,727 [Thread-32] DEBUG
org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at
row=info.mde253811.preprod.ubithere.com,,00000000000000 for max=10 rows
2011-07-28 08:20:16,730 [Thread-32] DEBUG
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation -
Cached location for
info.mde253811.preprod.ubithere.com,,1311086202955.1975990008 is
aws03.machine.com:60020
2011-07-28 08:20:16,732 [Thread-32] DEBUG
org.apache.hadoop.hbase.client.MetaScanner - Scanning .META. starting at
row=info.mde253811.preprod.ubithere.com,,00000000000000 for max=2147483647 rows
2011-07-28 08:20:16,772 [Thread-32] DEBUG
org.apache.hadoop.hbase.mapreduce.TableInputFormatBase - getSplits: split -> 0
-> aws03.machine.com:,
2011-07-28 08:20:16,772 [Thread-32] INFO
org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat - Got 1 splits.
2011-07-28 08:20:16,772 [Thread-32] INFO
org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat - Returning 1 splits.
2011-07-28 08:20:17,500 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- HadoopJobId: job_201107251336_0315
2011-07-28 08:20:17,500 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- More information at:
http://aws09.preprod.ubithere.com:50030/jobdetails.jsp?jobid=job_201107251336_0315
2011-07-28 08:20:28,075 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- 37% complete
2011-07-28 08:20:34,106 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- 41% complete
2011-07-28 08:20:37,124 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- 50% complete
2011-07-28 08:20:46,168 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- 51% complete
2011-07-28 08:20:49,183 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- 61% complete
2011-07-28 08:20:52,198 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- 62% complete
2011-07-28 08:20:55,214 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- 64% complete
2011-07-28 08:21:01,244 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- 66% complete
2011-07-28 08:21:07,311 [main] INFO org.apache.pig.tools.pigstats.ScriptState
- Pig script settings are added to the job
2011-07-28 08:21:07,312 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler
- mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2011-07-28 08:21:08,770 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler
- Setting up single store job
2011-07-28 08:21:08,778 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- 1 map-reduce job(s) waiting for submission.
2011-07-28 08:21:08,910 [Thread-47] INFO
org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to
process : 1
2011-07-28 08:21:08,910 [Thread-47] INFO
org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input
paths to process : 1
2011-07-28 08:21:08,911 [Thread-47] INFO
org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input
paths (combined) to process : 1
2011-07-28 08:21:09,280 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- HadoopJobId: job_201107251336_0316
2011-07-28 08:21:09,280 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- More information at:
http://aws09.preprod.ubithere.com:50030/jobdetails.jsp?jobid=job_201107251336_0316
2011-07-28 08:21:16,321 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- 83% complete
2011-07-28 08:21:34,439 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- 100% complete
2011-07-28 08:21:34,441 [main] INFO org.apache.pig.tools.pigstats.PigStats -
Script Statistics:
HadoopVersion PigVersion UserId StartedAt FinishedAt Features
0.20-append 0.8.1-SNAPSHOT root 2011-07-28 08:17:51 2011-07-28
08:21:34 HASH_JOIN,FILTER,LIMIT
Success!
Job Stats (time in seconds):
JobId Maps Reduces MaxMapTime MinMapTIme AvgMapTime
MaxReduceTime MinReduceTime AvgReduceTime Alias Feature Outputs
job_201107251336_0314 2 1 75 66 70 63 63 63
end_sessions,sessions,start_sessions HASH_JOIN
job_201107251336_0315 4 1 15 6 12 24 24 24
infos,sessions HASH_JOIN
job_201107251336_0316 1 1 3 3 3 12 12 12
hdfs://aws09.preprod.ubithere.com:9000/tmp/temp-1404953096/tmp819396740,
Input(s):
Successfully read 2069446 records from:
"endSession.mde253811.preprod.ubithere.com"
Successfully read 2072419 records from:
"startSession.mde253811.preprod.ubithere.com"
Successfully read 19441 records from: "info.mde253811.preprod.ubithere.com"
Output(s):
Successfully stored 0 records in:
"hdfs://aws09.preprod.ubithere.com:9000/tmp/temp-1404953096/tmp819396740"
Counters:
Total records written : 0
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 1
Total records proactively spilled: 1944943
Job DAG:
job_201107251336_0314 -> job_201107251336_0315,
job_201107251336_0315 -> job_201107251336_0316,
job_201107251336_0316
2011-07-28 08:21:34,472 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- Success!
2011-07-28 08:21:34,500 [main] INFO
org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to
process : 1
2011-07-28 08:21:34,501 [main] INFO
org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input
paths to process : 1
grunt>
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira