Hi, I have a very weird issue with my Pig script. The content of the script is as follows:
REGISTER /home/hadoopuser/Workspace/lib/piggybank.jar;
REGISTER /home/hadoopuser/Workspace/lib/datafu.jar;
REGISTER /opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hbase/hbase-0.94.2-cdh4.2.1-security.jar;
REGISTER /opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/zookeeper/zookeeper-3.4.5-cdh4.2.1.jar;
SET default_parallel 15;
records = LOAD 'hbase://dm-re' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('v:ctm v:src','-caching 5000 -gt 1366098805& -lt 1366102543&') as (time:chararray,company:chararray);
records_iso = FOREACH records GENERATE org.apache.pig.piggybank.evaluation.datetime.convert.CustomFormatToISO(time,'yyyy-MM-dd HH:mm:ss Z') as iso_time;
records_group = GROUP records_iso ALL;
result = FOREACH records_group GENERATE MAX(records_iso.iso_time) as maxtime;
DUMP result;

When I run this script on a cluster of 5 nodes with 20 map slots, most of the map tasks fail with the following error about 10 minutes after initializing:

Task attempt <id> failed to report status for 600 seconds. Killing!

I tried decreasing the caching size to less than 100 or so (on the intuition that fetching and processing a larger cache might take more time), but the issue remains. However, if I restrict the rows loaded (using -lt and -gt) so that the number of map tasks is <= 2, the job finishes successfully. When the number of tasks is > 2, it is always the case that 2-4 tasks complete and all the rest fail with the error above.

I have attached the task tracker log for this attempt. I don't see any errors in it apart from some ZooKeeper connection warnings. I checked manually from that node, and 'hbase zkcli' connects without any issue, so I assume that ZooKeeper is configured properly.

I don't really know where to start debugging this problem. It would be great if someone could provide assistance.
Some configuration settings of the cluster that I think may be relevant here:

dfs.block.size = 1 GB
io.sort.mb = 1 GB
HRegion size = 1 GB

The size of the HBase table is close to 250 GB. I have observed 100% CPU usage by the mapred user on the node while the task is executing. I am not really sure what to optimize in this case for the job to complete. It would be good if someone could shed some light on this.

PS: All the nodes in my cluster are EBS-backed Amazon EC2 instances.

--
Regards,
Praveen Bysani
http://www.praveenbysani.com
Task Logs: 'attempt_201305081039_0028_m_000010_0'
stdout logs
stderr logs
syslog logs
2013-05-13 06:29:17,218 WARN mapreduce.Counters: Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead
2013-05-13 06:29:18,921 INFO org.apache.hadoop.filecache.TrackerDistributedCacheManager: Creating symlink: /ebs/mapred/local/taskTracker/hadoopuser/jobcache/job_201305081039_0028/jars/job.jar <- /ebs/mapred/local/taskTracker/hadoopuser/jobcache/job_201305081039_0028/attempt_201305081039_0028_m_000010_0/work/job.jar
2013-05-13 06:29:18,954 INFO org.apache.hadoop.filecache.TrackerDistributedCacheManager: Creating symlink: /ebs/mapred/local/taskTracker/hadoopuser/jobcache/job_201305081039_0028/jars/.job.jar.crc <- /ebs/mapred/local/taskTracker/hadoopuser/jobcache/job_201305081039_0028/attempt_201305081039_0028_m_000010_0/work/.job.jar.crc
2013-05-13 06:29:19,076 WARN org.apache.hadoop.conf.Configuration: session.id is deprecated. Instead, use dfs.metrics.session-id
2013-05-13 06:29:19,077 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=MAP, sessionId=
2013-05-13 06:29:19,923 INFO org.apache.hadoop.util.ProcessTree: setsid exited with exit code 0
2013-05-13 06:29:19,967 INFO org.apache.hadoop.mapred.Task: Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@5a92668c
2013-05-13 06:29:20,345 INFO org.apache.hadoop.mapred.MapTask: Processing split: Number of splits :1
Total Length = 0
Input split[0]:
Length = 0
Locations:
ip-10-122-3-220.ap-northeast-1.compute.internal
-----------------------
2013-05-13 06:29:20,777 INFO org.apache.zookeeper.ZooKeeper: Client environment:zookeeper.version=3.4.5-cdh4.2.1--1, built on 04/22/2013 03:46 GMT
2013-05-13 06:29:20,777 INFO org.apache.zookeeper.ZooKeeper: Client environment:host.name=ip-10-122-3-220.ap-northeast-1.compute.internal
2013-05-13 06:29:20,777 INFO org.apache.zookeeper.ZooKeeper: Client
environment:java.version=1.6.0_31 2013-05-13 06:29:20,777 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.vendor=Sun Microsystems Inc. 2013-05-13 06:29:20,778 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.home=/usr/lib/jvm/j2sdk1.6-oracle/jre 2013-05-13 06:29:20,778 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.class.path=/run/cloudera-scm-agent/process/215-mapreduce-TASKTRACKER:/usr/lib/jvm/j2sdk1.6-oracle/lib/tools.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-0.20-mapreduce:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-0.20-mapreduce/hadoop-core-2.0.0-mr1-cdh4.2.1.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-0.20-mapreduce/lib/activation-1.1.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-0.20-mapreduce/lib/ant-contrib-1.0b3.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-0.20-mapreduce/lib/asm-3.2.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-0.20-mapreduce/lib/aspectjrt-1.6.5.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-0.20-mapreduce/lib/aspectjtools-1.6.5.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-0.20-mapreduce/lib/avro-1.7.3.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-0.20-mapreduce/lib/avro-compiler-1.7.3.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-0.20-mapreduce/lib/commons-beanutils-1.7.0.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-0.20-mapreduce/lib/commons-beanutils-core-1.8.0.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-0.20-mapreduce/lib/commons-cli-1.2.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-0.20-mapreduce/lib/commons-codec-1.4.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-0.20-mapreduce/lib/commons-collections-3.2.1.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-0.20-mapreduce/lib/commons-configuration-1.6.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh
4.2.1.p0.5/lib/hadoop-0.20-mapreduce/lib/commons-digester-1.8.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-0.20-mapreduce/lib/commons-el-1.0.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-0.20-mapreduce/lib/commons-httpclient-3.1.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-0.20-mapreduce/lib/commons-io-2.1.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-0.20-mapreduce/lib/commons-lang-2.5.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-0.20-mapreduce/lib/commons-logging-1.1.1.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-0.20-mapreduce/lib/commons-math-2.1.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-0.20-mapreduce/lib/commons-net-3.1.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-0.20-mapreduce/lib/guava-11.0.2.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-0.20-mapreduce/lib/hadoop-fairscheduler-2.0.0-mr1-cdh4.2.1.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-0.20-mapreduce/lib/hsqldb-1.8.0.10.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-0.20-mapreduce/lib/jackson-core-asl-1.8.8.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-0.20-mapreduce/lib/jackson-jaxrs-1.8.8.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-0.20-mapreduce/lib/jackson-mapper-asl-1.8.8.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-0.20-mapreduce/lib/jackson-xc-1.8.8.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-0.20-mapreduce/lib/jasper-compiler-5.5.23.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-0.20-mapreduce/lib/jasper-runtime-5.5.23.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-0.20-mapreduce/lib/jaxb-api-2.2.2.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-0.20-mapreduce/lib/jaxb-impl-2.2.3-1.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-0.20-mapreduce/lib/jersey-core-1.8.jar:/op
t/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-0.20-mapreduce/lib/jersey-json-1.8.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-0.20-mapreduce/lib/jersey-server-1.8.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-0.20-mapreduce/lib/jets3t-0.6.1.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-0.20-mapreduce/lib/jettison-1.1.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-0.20-mapreduce/lib/jetty-6.1.26.cloudera.2.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-0.20-mapreduce/lib/jetty-util-6.1.26.cloudera.2.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-0.20-mapreduce/lib/jline-0.9.94.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-0.20-mapreduce/lib/jsch-0.1.42.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-0.20-mapreduce/lib/jsp-api-2.1.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-0.20-mapreduce/lib/jsr305-1.3.9.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-0.20-mapreduce/lib/junit-4.8.2.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-0.20-mapreduce/lib/kfs-0.2.2.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-0.20-mapreduce/lib/kfs-0.3.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-0.20-mapreduce/lib/log4j-1.2.17.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-0.20-mapreduce/lib/mockito-all-1.8.5.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-0.20-mapreduce/lib/paranamer-2.3.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-0.20-mapreduce/lib/protobuf-java-2.4.0a.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-0.20-mapreduce/lib/servlet-api-2.5.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-0.20-mapreduce/lib/slf4j-api-1.6.1.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-0.20-mapreduce/lib/snappy-java-1.0.4.1.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/
hadoop-0.20-mapreduce/lib/stax-api-1.0.1.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-0.20-mapreduce/lib/xmlenc-0.52.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-0.20-mapreduce/lib/zookeeper-3.4.5-cdh4.2.1.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-0.20-mapreduce/lib/jsp-2.1/jsp-2.1.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-0.20-mapreduce/lib/jsp-2.1/jsp-api-2.1.jar:/usr/share/cmf/lib/plugins/navigator-plugin-4.5.2-shaded.jar:/usr/share/cmf/lib/plugins/tt-instrumentation-4.5.2.jar:/usr/share/cmf/lib/plugins/event-publish-4.5.2-shaded.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-hdfs/lib/jackson-mapper-asl-1.8.8.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-hdfs/lib/commons-io-2.1.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-hdfs/lib/commons-logging-1.1.1.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-hdfs/lib/xmlenc-0.52.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-hdfs/lib/jasper-runtime-5.5.23.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-hdfs/lib/servlet-api-2.5.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-hdfs/lib/zookeeper-3.4.5-cdh4.2.1.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-hdfs/lib/jline-0.9.94.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-hdfs/lib/jackson-core-asl-1.8.8.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-hdfs/lib/jsp-api-2.1.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-hdfs/lib/guava-11.0.2.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-hdfs/lib/commons-cli-1.2.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-hdfs/lib/asm-3.2.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-hdfs/lib/commons-daemon-1.0.3.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-hdfs/lib/log4j-1.2.17.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.
p0.5/lib/hadoop-hdfs/lib/protobuf-java-2.4.0a.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-hdfs/lib/commons-el-1.0.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-hdfs/lib/jetty-util-6.1.26.cloudera.2.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-hdfs/lib/jersey-core-1.8.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-hdfs/lib/jersey-server-1.8.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-hdfs/lib/commons-codec-1.4.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-hdfs/lib/jetty-6.1.26.cloudera.2.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-hdfs/lib/jsr305-1.3.9.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-hdfs/lib/commons-lang-2.5.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-hdfs/hadoop-hdfs-2.0.0-cdh4.2.1.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-hdfs/hadoop-hdfs.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-hdfs/hadoop-hdfs-2.0.0-cdh4.2.1-tests.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop/lib/jackson-mapper-asl-1.8.8.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop/lib/commons-io-2.1.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop/lib/commons-logging-1.1.1.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop/lib/xmlenc-0.52.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop/lib/jasper-runtime-5.5.23.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop/lib/servlet-api-2.5.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop/lib/jackson-jaxrs-1.8.8.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop/lib/zookeeper-3.4.5-cdh4.2.1.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop/lib/jline-0.9.94.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop/lib/commons-collections-3.2.1.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop/lib/commons-httpclient-
3.1.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop/lib/jackson-core-asl-1.8.8.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop/lib/jsp-api-2.1.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop/lib/guava-11.0.2.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop/lib/slf4j-log4j12-1.6.1.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop/lib/commons-net-3.1.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop/lib/commons-cli-1.2.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop/lib/commons-configuration-1.6.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop/lib/commons-digester-1.8.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop/lib/jasper-compiler-5.5.23.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop/lib/commons-beanutils-1.7.0.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop/lib/snappy-java-1.0.4.1.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop/lib/jaxb-impl-2.2.3-1.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop/lib/slf4j-api-1.6.1.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop/lib/paranamer-2.3.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop/lib/asm-3.2.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop/lib/junit-4.8.2.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop/lib/commons-beanutils-core-1.8.0.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop/lib/jersey-json-1.8.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop/lib/commons-math-2.1.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop/lib/log4j-1.2.17.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop/lib/protobuf-java-2.4.0a.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop/lib/commons-el-1.0.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop/lib/mockito-all-1.8.5.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2
.1.p0.5/lib/hadoop/lib/jettison-1.1.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop/lib/activation-1.1.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop/lib/jsch-0.1.42.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop/lib/jetty-util-6.1.26.cloudera.2.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop/lib/jersey-core-1.8.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop/lib/hue-plugins-2.2.0-cdh4.2.1.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop/lib/avro-1.7.3.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop/lib/jersey-server-1.8.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop/lib/jaxb-api-2.2.2.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop/lib/jackson-xc-1.8.8.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop/lib/stax-api-1.0.1.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop/lib/commons-codec-1.4.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop/lib/jets3t-0.6.1.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop/lib/jetty-6.1.26.cloudera.2.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop/lib/jsr305-1.3.9.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop/lib/kfs-0.3.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop/lib/commons-lang-2.5.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop/hadoop-common-2.0.0-cdh4.2.1.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop/hadoop-auth.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop/hadoop-auth-2.0.0-cdh4.2.1.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop/hadoop-common-2.0.0-cdh4.2.1-tests.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop/hadoop-annotations-2.0.0-cdh4.2.1.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop/hadoop-common.jar:/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop/hadoop-annotations.jar:/ebs/mapr
ed/local/taskTracker/hadoopuser/jobcache/job_201305081039_0028/jars/classes:/ebs/mapred/local/taskTracker/hadoopuser/jobcache/job_201305081039_0028/jars/job.jar:/ebs/mapred/local/taskTracker/distcache/-3456193334429412951_402957723_479460206/ip-10-122-5-26.ap-northeast-1.compute.internal/tmp/temp-651112871/tmp869044128/piggybank.jar:/ebs/mapred/local/taskTracker/distcache/-7337653544631754240_-1468589137_479460237/ip-10-122-5-26.ap-northeast-1.compute.internal/tmp/temp-651112871/tmp1875426668/datafu.jar:/ebs/mapred/local/taskTracker/distcache/-1402638602126841415_-1435345198_479460330/ip-10-122-5-26.ap-northeast-1.compute.internal/tmp/temp-651112871/tmp-1838925944/hbase-0.94.2-cdh4.2.1-security.jar:/ebs/mapred/local/taskTracker/distcache/7452800711435023543_2121443610_479460382/ip-10-122-5-26.ap-northeast-1.compute.internal/tmp/temp-651112871/tmp-1661097048/zookeeper-3.4.5-cdh4.2.1.jar:/ebs/mapred/local/taskTracker/hadoopuser/distcache/-6387295021460789984_342618622_479466414/ip-10-122-5-26.ap-northeast-1.compute.internal/user/hadoopuser/.staging/job_201305081039_0028/libjars/zookeeper-3.4.5-cdh4.2.1.jar:/ebs/mapred/local/taskTracker/hadoopuser/distcache/-4928843957969028386_439824881_479466462/ip-10-122-5-26.ap-northeast-1.compute.internal/user/hadoopuser/.staging/job_201305081039_0028/libjars/guava-11.0.2.jar:/ebs/mapred/local/taskTracker/hadoopuser/distcache/5720329722789448786_1097322137_479466543/ip-10-122-5-26.ap-northeast-1.compute.internal/user/hadoopuser/.staging/job_201305081039_0028/libjars/hbase-0.94.2-cdh4.2.1-security.jar:/ebs/mapred/local/taskTracker/hadoopuser/jobcache/job_201305081039_0028/attempt_201305081039_0028_m_000010_0/work 2013-05-13 06:29:20,778 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.library.path=/opt/cloudera/parcels/CDH-4.2.1-1.cdh4.2.1.p0.5/lib/hadoop-0.20-mapreduce/lib/native/Linux-amd64-64:/ebs/mapred/local/taskTracker/hadoopuser/jobcache/job_201305081039_0028/attempt_201305081039_0028_m_000010_0/work 2013-05-13 
06:29:20,778 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.io.tmpdir=/ebs/mapred/local/taskTracker/hadoopuser/jobcache/job_201305081039_0028/attempt_201305081039_0028_m_000010_0/work/tmp
2013-05-13 06:29:20,778 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.compiler=<NA>
2013-05-13 06:29:20,778 INFO org.apache.zookeeper.ZooKeeper: Client environment:os.name=Linux
2013-05-13 06:29:20,778 INFO org.apache.zookeeper.ZooKeeper: Client environment:os.arch=amd64
2013-05-13 06:29:20,778 INFO org.apache.zookeeper.ZooKeeper: Client environment:os.version=3.2.0-36-virtual
2013-05-13 06:29:20,778 INFO org.apache.zookeeper.ZooKeeper: Client environment:user.name=mapred
2013-05-13 06:29:20,778 INFO org.apache.zookeeper.ZooKeeper: Client environment:user.home=/var/lib/hadoop-mapreduce
2013-05-13 06:29:20,778 INFO org.apache.zookeeper.ZooKeeper: Client environment:user.dir=/ebs/mapred/local/taskTracker/hadoopuser/jobcache/job_201305081039_0028/attempt_201305081039_0028_m_000010_0/work
2013-05-13 06:29:20,779 INFO org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=ip-10-122-5-26.ap-northeast-1.compute.internal:2181 sessionTimeout=60000 watcher=hconnection
2013-05-13 06:29:20,879 INFO org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: The identifier of this process is 22865@ip-10-122-3-220
2013-05-13 06:29:20,891 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server ip-10-122-5-26.ap-northeast-1.compute.internal/10.122.5.26:2181. Will not attempt to authenticate using SASL (Unable to locate a login configuration)
2013-05-13 06:29:20,896 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to ip-10-122-5-26.ap-northeast-1.compute.internal/10.122.5.26:2181, initiating session
2013-05-13 06:29:20,914 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server ip-10-122-5-26.ap-northeast-1.compute.internal/10.122.5.26:2181, sessionid = 0x13e83a6bd97045c, negotiated timeout = 60000
2013-05-13 06:29:21,328 WARN org.apache.hadoop.conf.Configuration: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
2013-05-13 06:29:21,705 INFO org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat: setScan with ranges: 1608144425504356317089131618754481167815455570216331582650421 - 1608144425504356338779411601026123924953063215065535954039604 ( 21690279982271642757137607644849204371389183)
2013-05-13 06:29:21,776 INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader: Current split being processed ip-10-122-3-220.ap-northeast-1.compute.internal:1358368301&crown&5222885,1358371115&crown&5601434
2013-05-13 06:29:21,788 INFO org.apache.hadoop.mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
2013-05-13 06:29:21,795 INFO org.apache.hadoop.mapred.MapTask: io.sort.mb = 1024
2013-05-13 06:29:24,102 INFO org.apache.hadoop.mapred.MapTask: data buffer = 816043776/1020054736
2013-05-13 06:29:24,102 INFO org.apache.hadoop.mapred.MapTask: record buffer = 2684354/3355443
2013-05-13 06:29:24,118 INFO org.apache.pig.impl.util.SpillableMemoryManager: first memory handler call- Usage threshold init = 163708928(159872K) used = 1020054752(996147K) committed = 1073741824(1048576K) max = 1073741824(1048576K)
2013-05-13 06:29:24,123 INFO org.apache.pig.impl.util.SpillableMemoryManager: first memory handler call - Collection threshold init = 163708928(159872K) used = 1022801224(998829K) committed = 1073741824(1048576K) max = 1073741824(1048576K)
2013-05-13 06:29:24,137 WARN org.apache.hadoop.conf.Configuration: dfs.df.interval is deprecated. Instead, use fs.df.interval
2013-05-13 06:29:24,138 WARN org.apache.hadoop.conf.Configuration: dfs.max.objects is deprecated. Instead, use dfs.namenode.max.objects
2013-05-13 06:29:24,138 WARN org.apache.hadoop.conf.Configuration: dfs.data.dir is deprecated. Instead, use dfs.datanode.data.dir
2013-05-13 06:29:24,138 WARN org.apache.hadoop.conf.Configuration: dfs.name.dir is deprecated. Instead, use dfs.namenode.name.dir
2013-05-13 06:29:24,138 WARN org.apache.hadoop.conf.Configuration: fs.default.name is deprecated. Instead, use fs.defaultFS
2013-05-13 06:29:24,138 WARN org.apache.hadoop.conf.Configuration: fs.checkpoint.dir is deprecated. Instead, use dfs.namenode.checkpoint.dir
2013-05-13 06:29:24,138 WARN org.apache.hadoop.conf.Configuration: dfs.block.size is deprecated. Instead, use dfs.blocksize
2013-05-13 06:29:24,138 WARN org.apache.hadoop.conf.Configuration: dfs.access.time.precision is deprecated. Instead, use dfs.namenode.accesstime.precision
2013-05-13 06:29:24,138 WARN org.apache.hadoop.conf.Configuration: dfs.replication.min is deprecated. Instead, use dfs.namenode.replication.min
2013-05-13 06:29:24,138 WARN org.apache.hadoop.conf.Configuration: dfs.name.edits.dir is deprecated. Instead, use dfs.namenode.edits.dir
2013-05-13 06:29:24,139 WARN org.apache.hadoop.conf.Configuration: dfs.replication.considerLoad is deprecated. Instead, use dfs.namenode.replication.considerLoad
2013-05-13 06:29:24,139 WARN org.apache.hadoop.conf.Configuration: dfs.balance.bandwidthPerSec is deprecated. Instead, use dfs.datanode.balance.bandwidthPerSec
2013-05-13 06:29:24,139 WARN org.apache.hadoop.conf.Configuration: dfs.safemode.threshold.pct is deprecated. Instead, use dfs.namenode.safemode.threshold-pct
2013-05-13 06:29:24,139 WARN org.apache.hadoop.conf.Configuration: dfs.http.address is deprecated. Instead, use dfs.namenode.http-address
2013-05-13 06:29:24,139 WARN org.apache.hadoop.conf.Configuration: dfs.name.dir.restore is deprecated. Instead, use dfs.namenode.name.dir.restore
2013-05-13 06:29:24,139 WARN org.apache.hadoop.conf.Configuration: dfs.https.client.keystore.resource is deprecated. Instead, use dfs.client.https.keystore.resource
2013-05-13 06:29:24,139 WARN org.apache.hadoop.conf.Configuration: dfs.backup.address is deprecated. Instead, use dfs.namenode.backup.address
2013-05-13 06:29:24,139 WARN org.apache.hadoop.conf.Configuration: dfs.backup.http.address is deprecated. Instead, use dfs.namenode.backup.http-address
2013-05-13 06:29:24,139 WARN org.apache.hadoop.conf.Configuration: dfs.permissions is deprecated. Instead, use dfs.permissions.enabled
2013-05-13 06:29:24,140 WARN org.apache.hadoop.conf.Configuration: dfs.safemode.extension is deprecated. Instead, use dfs.namenode.safemode.extension
2013-05-13 06:29:24,140 WARN org.apache.hadoop.conf.Configuration: dfs.datanode.max.xcievers is deprecated. Instead, use dfs.datanode.max.transfer.threads
2013-05-13 06:29:24,140 WARN org.apache.hadoop.conf.Configuration: dfs.https.need.client.auth is deprecated. Instead, use dfs.client.https.need-auth
2013-05-13 06:29:24,140 WARN org.apache.hadoop.conf.Configuration: dfs.https.address is deprecated. Instead, use dfs.namenode.https-address
2013-05-13 06:29:24,140 WARN org.apache.hadoop.conf.Configuration: dfs.replication.interval is deprecated. Instead, use dfs.namenode.replication.interval
2013-05-13 06:29:24,140 WARN org.apache.hadoop.conf.Configuration: fs.checkpoint.edits.dir is deprecated. Instead, use dfs.namenode.checkpoint.edits.dir
2013-05-13 06:29:24,140 WARN org.apache.hadoop.conf.Configuration: dfs.write.packet.size is deprecated. Instead, use dfs.client-write-packet-size
2013-05-13 06:29:24,140 WARN org.apache.hadoop.conf.Configuration: dfs.permissions.supergroup is deprecated. Instead, use dfs.permissions.superusergroup
2013-05-13 06:29:24,140 WARN org.apache.hadoop.conf.Configuration: topology.script.number.args is deprecated. Instead, use net.topology.script.number.args
2013-05-13 06:29:24,141 WARN org.apache.hadoop.conf.Configuration: dfs.umaskmode is deprecated. Instead, use fs.permissions.umask-mode
2013-05-13 06:29:24,141 WARN org.apache.hadoop.conf.Configuration: dfs.secondary.http.address is deprecated. Instead, use dfs.namenode.secondary.http-address
2013-05-13 06:29:24,141 WARN org.apache.hadoop.conf.Configuration: fs.checkpoint.period is deprecated. Instead, use dfs.namenode.checkpoint.period
2013-05-13 06:29:24,141 WARN org.apache.hadoop.conf.Configuration: topology.node.switch.mapping.impl is deprecated. Instead, use net.topology.node.switch.mapping.impl
2013-05-13 06:29:24,141 WARN org.apache.hadoop.conf.Configuration: io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
2013-05-13 06:29:24,207 INFO org.apache.pig.data.SchemaTupleBackend: Key [pig.schematuple] was not set... will not generate code.
2013-05-13 06:29:24,678 INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigGenericMapReduce$Map: Aliases being processed per job phase (AliasName[line,offset]): M: bet_records[7,14],bet_records_iso[-1,-1],bet_result[-1,-1],bet_records_group[11,20] C: bet_result[-1,-1],bet_records_group[11,20] R: bet_result[-1,-1]
2013-05-13 06:30:02,971 INFO org.apache.hadoop.mapred.MapTask: Starting flush of map output
2013-05-13 06:30:03,147 INFO org.apache.hadoop.io.compress.CodecPool: Got brand-new compressor [.snappy]
2013-05-13 06:30:03,394 INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigCombiner$Combine: Aliases being processed per job phase (AliasName[line,offset]): M: bet_records[7,14],bet_records_iso[-1,-1],bet_result[-1,-1],bet_records_group[11,20] C: bet_result[-1,-1],bet_records_group[11,20] R: bet_result[-1,-1]
2013-05-13 06:37:41,107 INFO org.apache.zookeeper.ClientCnxn: Unable to read additional data from server sessionid 0x13e83a6bd97045c, likely server has closed socket, closing socket connection and attempting reconnect
2013-05-13 06:38:13,251 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server ip-10-122-5-26.ap-northeast-1.compute.internal/10.122.5.26:2181. Will not attempt to authenticate using SASL (Unable to locate a login configuration)
2013-05-13 06:38:13,252 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to ip-10-122-5-26.ap-northeast-1.compute.internal/10.122.5.26:2181, initiating session
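
As a reading aid for the memory figures in the log above (the "io.sort.mb = 1024" line, the "data buffer" line, and the first SpillableMemoryManager line), the following is a small Python sketch that puts them side by side. The numbers are copied from the log; the function and variable names are illustrative only, and the sketch does not claim to identify the cause of the timeouts.

```python
# Figures copied verbatim from the task log above.
# All names here are illustrative, not part of Hadoop or Pig.
MIB = 1024 * 1024

io_sort_mb = 1024               # "io.sort.mb = 1024"
heap_max_bytes = 1073741824     # "max = 1073741824(1048576K)"
heap_used_bytes = 1020054752    # "used = 1020054752(996147K)" (usage threshold call)
data_buffer_bytes = 1020054736  # second number in "data buffer = 816043776/1020054736"


def summarize(sort_mb, heap_max, heap_used, data_buffer):
    """Compare the configured sort buffer with the heap figures the task reported."""
    sort_buffer_bytes = sort_mb * MIB
    return {
        "sort_buffer_equals_heap_max": sort_buffer_bytes == heap_max,
        "heap_used_fraction": heap_used / heap_max,
        "heap_free_mib": (heap_max - heap_used) / MIB,
        "data_buffer_fraction_of_heap": data_buffer / heap_max,
    }


summary = summarize(io_sort_mb, heap_max_bytes, heap_used_bytes, data_buffer_bytes)
for name, value in summary.items():
    print(f"{name}: {value}")
```

On these figures, the configured sort buffer is the same size as the maximum heap reported by the task, and about 95% of that heap was already in use at the first memory handler call. Whether this is related to the 100% CPU usage or the 600-second timeout is not established by the log alone.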