Sure, attached below are the job counter values. I checked the final status of the job and it said succeeded. I could not see whether the import tool itself exited cleanly, because I ran it overnight and my machine rebooted at some point to install updates. I wonder if there is some post-processing after the MR job that might have failed because of this?
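One possibility, assuming my reading of the tool is right (I have not verified this in the source): the bulk load client performs a final step after the MR job succeeds, handing the HFiles written by the reducers over to the HBase region servers. If the reboot killed the client process before that handoff, the job history would show success while the table stays empty. If so, I believe something like the following would retry just the handoff by hand (the output directory below is a placeholder for whatever output path my run actually used):

    # retry only the HFile handoff; /tmp/bulkload-output and MYTABLE are placeholders
    hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles \
        /tmp/bulkload-output MYTABLE

Does that sound plausible?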
Thanks for the help!

----------------
Counters for job_1442389862209_0002

File System Counters (Map / Reduce / Total)
  FILE: Number of bytes read: 1520770904675 / 2604849340144 / 4125620244819
  FILE: Number of bytes written: 3031784709196 / 2616689890216 / 5648474599412
  FILE: Number of large read operations: 0 / 0 / 0
  FILE: Number of read operations: 0 / 0 / 0
  FILE: Number of write operations: 0 / 0 / 0
  WASB: Number of bytes read: 186405294283 / 0 / 186405294283
  WASB: Number of bytes written: 0 / 363027342839 / 363027342839
  WASB: Number of large read operations: 0 / 0 / 0
  WASB: Number of read operations: 0 / 0 / 0
  WASB: Number of write operations: 0 / 0 / 0

Job Counters (Map / Reduce / Total)
  Launched map tasks: 0 / 0 / 348
  Launched reduce tasks: 0 / 0 / 9
  Rack-local map tasks: 0 / 0 / 348
  Total megabyte-seconds taken by all map tasks: 0 / 0 / 460560315648
  Total megabyte-seconds taken by all reduce tasks: 0 / 0 / 158604449280
  Total time spent by all map tasks (ms): 0 / 0 / 599687911
  Total time spent by all maps in occupied slots (ms): 0 / 0 / 599687911
  Total time spent by all reduce tasks (ms): 0 / 0 / 103258105
  Total time spent by all reduces in occupied slots (ms): 0 / 0 / 206516210
  Total vcore-seconds taken by all map tasks: 0 / 0 / 599687911
  Total vcore-seconds taken by all reduce tasks: 0 / 0 / 103258105

Map-Reduce Framework (Map / Reduce / Total)
  Combine input records: 0 / 0 / 0
  Combine output records: 0 / 0 / 0
  CPU time spent (ms): 162773540 / 90154160 / 252927700
  Failed Shuffles: 0 / 0 / 0
  GC time elapsed (ms): 7667781 / 1607188 / 9274969
  Input split bytes: 52548 / 0 / 52548
  Map input records: 861890673 / 0 / 861890673
  Map output bytes: 1488284643774 / 0 / 1488284643774
  Map output materialized bytes: 1515865164102 / 0 / 1515865164102
  Map output records: 13790250768 / 0 / 13790250768
  Merged Map outputs: 0 / 3132 / 3132
  Physical memory (bytes) snapshot: 192242380800 / 4546826240 / 196789207040
  Reduce input groups: 0 / 861890673 / 861890673
  Reduce input records: 0 / 13790250768 / 13790250768
  Reduce output records: 0 / 13790250768 / 13790250768
  Reduce shuffle bytes: 0 / 1515865164102 / 1515865164102
  Shuffled Maps: 0 / 3132 / 3132
  Spilled Records: 27580501536 / 23694179168 / 51274680704
  Total committed heap usage (bytes): 186401685504 / 3023044608 / 189424730112
  Virtual memory (bytes) snapshot: 537370951680 / 19158048768 / 556529000448

Phoenix MapReduce Import (Map / Reduce / Total)
  Upserts Done: 861890673 / 0 / 861890673

Shuffle Errors (Map / Reduce / Total)
  BAD_ID: 0 / 0 / 0
  CONNECTION: 0 / 0 / 0
  IO_ERROR: 0 / 0 / 0
  WRONG_LENGTH: 0 / 0 / 0
  WRONG_MAP: 0 / 0 / 0
  WRONG_REDUCE: 0 / 0 / 0

File Input Format Counters (Map / Reduce / Total)
  Bytes Read: 186395934997 / 0 / 186395934997

File Output Format Counters (Map / Reduce / Total)
  Bytes Written: 0 / 363027342839 / 363027342839

On 16 September 2015 at 11:46, Gabriel Reid <gabriel.r...@gmail.com> wrote:
> Can you view (and post) the job counter values from the import job?
> These should be visible in the job history server.
>
> Also, did you see the import tool exit successfully (in the terminal
> where you started it)?
>
> - Gabriel
>
> On Wed, Sep 16, 2015 at 6:24 PM, Gaurav Kanade <gaurav.kan...@gmail.com> wrote:
> > Hi guys
> >
> > I was able to get this to work after using bigger VMs for the data
> > nodes; however, the bigger problem I am now facing is that after my MR
> > job completes successfully, I am not seeing any rows loaded into my
> > table (count shows 0 both via Phoenix and HBase).
> >
> > Am I missing something simple?
> >
> > Thanks
> > Gaurav
> >
> > On 12 September 2015 at 11:16, Gabriel Reid <gabriel.r...@gmail.com> wrote:
> >>
> >> Around 1400 mappers sounds about normal to me -- I assume your block
> >> size on HDFS is 128 MB, which works out to roughly 1500 mappers for
> >> 200 GB of input.
> >>
> >> To add to what Krishna asked, can you be a bit more specific about
> >> what you're seeing (in log files or elsewhere) that leads you to
> >> believe the data nodes are running out of capacity? Are map tasks
> >> failing?
> >>
> >> If this is indeed a capacity issue, one thing you should ensure is
> >> that map output compression is enabled.
> >> This doc from Cloudera explains this (and the same information
> >> applies whether you're using CDH or not):
> >> http://www.cloudera.com/content/cloudera/en/documentation/cdh4/latest/CDH4-Installation-Guide/cdh4ig_topic_23_3.html
> >>
> >> In any case, apart from that, there isn't any basic thing that
> >> you're probably missing, so any additional information you can
> >> supply about what you're running into would be useful.
> >>
> >> - Gabriel
> >>
> >> On Sat, Sep 12, 2015 at 2:17 AM, Krishna <research...@gmail.com> wrote:
> >> > 1400 mappers on 9 nodes is about 155 mappers per datanode, which
> >> > sounds high to me. There are very few specifics in your mail. Are
> >> > you using YARN? Can you provide details like table structure, # of
> >> > rows & columns, etc.? Do you have an error stack?
> >> >
> >> > On Friday, September 11, 2015, Gaurav Kanade <gaurav.kan...@gmail.com> wrote:
> >> >>
> >> >> Hi All
> >> >>
> >> >> I am new to Apache Phoenix (and relatively new to MR in general),
> >> >> but I am trying a bulk insert of a 200 GB tab-separated file into
> >> >> an HBase table. This seems to start off fine and kicks off roughly
> >> >> 1400 mappers and 9 reducers (I have 9 data nodes in my setup).
> >> >>
> >> >> At some point I seem to run into problems with this process, as
> >> >> the data nodes appear to run out of capacity (from what I can see,
> >> >> my data nodes have 400 GB of local space). Certain reducers eat up
> >> >> most of the capacity on these nodes, slowing the process to a
> >> >> crawl and ultimately leading to the Node Managers complaining that
> >> >> node health is bad (log-dirs and local-dirs are bad).
> >> >>
> >> >> Is there some inherent setting I am missing that I need to set for
> >> >> this particular job?
> >> >>
> >> >> Any pointers would be appreciated.
> >> >>
> >> >> Thanks
> >> >>
> >> >> --
> >> >> Gaurav Kanade,
> >> >> Software Engineer
> >> >> Big Data
> >> >> Cloud and Enterprise Division
> >> >> Microsoft
> >
> >
> > --
> > Gaurav Kanade,
> > Software Engineer
> > Big Data
> > Cloud and Enterprise Division
> > Microsoft

--
Gaurav Kanade,
Software Engineer
Big Data
Cloud and Enterprise Division
Microsoft
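P.S. On the earlier map output compression suggestion: my understanding is that it can be enabled per job with -D flags, without touching the cluster config. A sketch of the invocation I have in mind for the next run; the jar name, table, and input path are placeholders for my actual values, I am assuming the Hadoop 2 property names here, and Snappy is only an option if the native libraries are available on the cluster:

    # -D flags must come before the tool's own options;
    # phoenix-<version>-client.jar, MYTABLE, and /data/input.tsv are placeholders
    hadoop jar phoenix-<version>-client.jar org.apache.phoenix.mapreduce.CsvBulkLoadTool \
        -Dmapreduce.map.output.compress=true \
        -Dmapreduce.map.output.compress.codec=org.apache.hadoop.io.compress.SnappyCodec \
        --table MYTABLE --input /data/input.tsv --delimiter $'\t'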