[Hadoop Wiki] Update of "Hive/HBaseBulkLoad" by JohnSichi

Apache Wiki Fri, 16 Apr 2010 16:33:12 -0700

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change 
notification.


The "Hive/HBaseBulkLoad" page has been changed by JohnSichi.
http://wiki.apache.org/hadoop/Hive/HBaseBulkLoad?action=diff&rev1=9&rev2=10

--------------------------------------------------

  The second command populates it (using the sampling query previously 
defined).  Using ORDER BY guarantees that a single file will be produced in 
directory {{{/tmp/hb_range_keys}}}.  The filename is not known in advance, but 
the file must be referenced by name later, so run a command such as the 
following to copy it to a fixed name:
  
  {{{
- dfs -cp /tmp/hb_range_keys/* /tmp/hb_range_key_list
+ dfs -cp /tmp/hb_range_keys/* /tmp/hb_range_key_list;
  }}}
  
  = Prepare Staging Location =
  
  The sort is going to produce a lot of data, so make sure you have sufficient 
space in your HDFS cluster, and choose the location where the files will be 
staged.  We'll use {{{/tmp/hbsort}}} in this example.
+ 
+ The directory does not actually need to exist (it will be automatically 
created in the next step), but if it does exist, it should be empty.
+ 
+ {{{
+ dfs -rmr /tmp/hbsort;
+ dfs -mkdir /tmp/hbsort;
+ }}}
  
  = Sort Data =
