[Hadoop Wiki] Update of "Hbase/Jython" by Misty

Apache Wiki Sun, 01 Nov 2015 20:55:33 -0800

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change 
notification.


The "Hbase/Jython" page has been changed by Misty:
https://wiki.apache.org/hadoop/Hbase/Jython?action=diff&rev1=15&rev2=16

- == Accessing HBase from Jython ==
+ The HBase Wiki is in the process of being decommissioned. The info that used 
to be on this page has moved to https://hbase.apache.org/book.html#jython. 
Please update your bookmarks.
  
- This page describes the process of connecting to HBase from Jython.  These 
instructions should help in connecting from other dynamic languages running on 
the JVM like Scala, JRuby, etc.  The code mostly follows the 
[[http://wiki.apache.org/hadoop/Hbase/FAQ#1|Can someone give an example of 
basic API-usage going against hbase?]] example listed in the HBase FAQ.
- 
- == Setting Your Classpath ==
- 
- Working with HBase from Jython is pretty simple assuming you've got your CLASSPATH set up.  The CLASSPATH is an environment variable that is basically a module search path: it lists the jar files where the code you're going to import and use lives.
- The HBase team are working on making it easy to set and get your CLASSPATH, 
but for now the way to get it is to start HBase:
- {{{
- bin/start-hbase.sh
- }}} 
- and then get the classpath like so
- {{{
- ps ax | grep regionserver
- }}}
- This prints the full java command line of the process.  Within that blob of text is a -classpath option, followed by a long list of paths.
- Alternatively, since the regionserver process no longer seems to show up in that output, you could do:
- {{{
- ps auwx|grep java|grep org.apache.hadoop.hbase.master.HMaster|perl -pe "s/.*classpath //"
- }}}
- Copy that text and then do
- {{{
- export CLASSPATH=$THE_CLASSPATH_YOU_COPIED
- }}}
- My CLASSPATH then contains 24 entries.
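The copy step above can also be scripted. The following is a minimal sketch in plain Python, not part of the original page: the helper names `extract_classpath` and `classpath_entries` and the sample `ps` line are hypothetical, and it assumes the JVM was started with a `-classpath` (or `-cp`) option whose value contains no spaces.

```python
import os


def extract_classpath(command_line):
    """Return the value following -classpath (or -cp) in a JVM command line.

    Returns None when neither option is present.
    """
    tokens = command_line.split()
    for i, token in enumerate(tokens[:-1]):
        if token in ("-classpath", "-cp"):
            return tokens[i + 1]
    return None


def classpath_entries(classpath, separator=os.pathsep):
    """Split a classpath string into its non-empty entries."""
    return [entry for entry in classpath.split(separator) if entry]


# Hypothetical sample of what one line of `ps` output might look like.
sample = ("/usr/bin/java -Xmx1000m -classpath "
          "/opt/hbase/conf:/opt/hbase/hbase.jar:/opt/hbase/lib/hadoop-core.jar "
          "org.apache.hadoop.hbase.master.HMaster start")

classpath = extract_classpath(sample)
entries = classpath_entries(classpath, ":")
```

Counting `entries` is a quick way to check that the whole option value was captured before exporting it as CLASSPATH.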
- When you start Jython it will likely print some stuff to the screen about 
processing each of the jars listed in your CLASSPATH.
- 
- 
- An alternative is to do the following:
- 
- {{{
- $ HBASE_OPTS="-Dpython.path=$JYTHON_HOME" HBASE_CLASSPATH=$JYTHON_HOME/jython.jar ./bin/hbase org.python.util.jython
- Jython 2.2.1 on java1.5.0_13
- Type "copyright", "credits" or "license" for more information.
- >>> from org.apache.hadoop.hbase.client import HTable
- >>> 
- }}}
- 
- This will start up a jython shell with all of hbase and hadoop on its 
CLASSPATH. Be sure to define JYTHON_HOME so it points at your jython install.
- 
- '''Note''': trying to add the HBase jars to Jython's ''sys.path'' (for instance via JYTHONPATH) will not work. To use the HBase jars in Jython you need to put them on the Java CLASSPATH.
- 
- == The Code ==
- 
- Once you've got that set it's as simple as just translating the Java on the 
FAQ page to legal Jython.
- 
- The code below creates a table, puts some data in it, fetches that data back 
out and then deletes the table.
- 
- Note: BatchUpdate is now '''Deprecated'''. As of HBase 0.20.0, it is replaced by the new org.apache.hadoop.hbase.client.Get/Put/Delete/Result-based API.
- {{{
- import java.lang
- from org.apache.hadoop.hbase import HBaseConfiguration, HTableDescriptor, HColumnDescriptor, HConstants
- from org.apache.hadoop.hbase.client import HBaseAdmin, HTable
- from org.apache.hadoop.hbase.io import BatchUpdate, Cell, RowResult
- 
- # First get a conf object.  This will read in the configuration 
- # that is out in your hbase-*.xml files such as location of the
- # hbase master node.
- conf = HBaseConfiguration()
- 
- # Create a table named 'test' that has two column families,
- # one named 'content' and the other 'anchor'.  The colons
- # are required for column family names.
- tablename = "test"  
- 
- desc = HTableDescriptor(tablename)
- desc.addFamily(HColumnDescriptor("content:"))
- desc.addFamily(HColumnDescriptor("anchor:"))
- admin = HBaseAdmin(conf)
- 
- # Drop and recreate if it exists
- if admin.tableExists(tablename):
-     admin.disableTable(tablename)
-     admin.deleteTable(tablename)
- admin.createTable(desc)
- 
- tables = admin.listTables()
- table = HTable(conf, tablename)
- 
- # Add content to 'content:' on a row named 'row_x'
- row = 'row_x'
- update = BatchUpdate(row)
- update.put('content:', 'some content')
- table.commit(update)
- 
- # Now fetch the content just added, returns a byte[]
- data_row = table.get(row, "content:")
- data = java.lang.String(data_row.value, "UTF8")
- 
- print "The fetched row contains the value '%s'" % data
- 
- # Delete the table.
- admin.disableTable(desc.getName())
- admin.deleteTable(desc.getName())
- 
- }}}
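The `family:qualifier` column naming used in the listing above (for example `content:`, a family with an empty qualifier) can be illustrated with a small helper. This is a sketch in plain Python; `split_column` is a hypothetical name and not an HBase API.

```python
def split_column(column):
    """Split an old-style 'family:qualifier' column name into its two parts.

    The colon is mandatory; the qualifier after it may be empty.
    """
    if ":" not in column:
        raise ValueError("column name %r has no ':' separator" % column)
    family, qualifier = column.split(":", 1)
    return family, qualifier
```

For instance, `split_column("content:")` yields the family `content` with an empty qualifier, while a name without a colon is rejected.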
- 
- == Another Sample ==
- {{{
- # Print all rows that are members of a particular column family 
- # by passing a regex for family qualifier
- 
- import java.lang
- 
- from org.apache.hadoop.hbase import HBaseConfiguration
- from org.apache.hadoop.hbase.client import HTable
- 
- conf = HBaseConfiguration()
- 
- table = HTable(conf, "wiki")
- col = "title:.*$"
- 
- scanner = table.getScanner([col], "")
- while 1:
-     result = scanner.next()
-     if not result:
-         break
-     print java.lang.String(result.row), java.lang.String(result.get('title:').value)
- 
- }}}
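To see which column names a pattern such as `title:.*$` would select, the match can be tried out without a cluster. This sketch uses Python's `re` module as a stand-in for the matching HBase performs, which is an assumption: HBase evaluates such patterns as Java regular expressions, though for a pattern this simple the two should agree. The function name `matching_columns` and the column names are made up for illustration.

```python
import re


def matching_columns(pattern, columns):
    """Return the column names that match the pattern from their first character."""
    compiled = re.compile(pattern)
    return [name for name in columns if compiled.match(name)]


# Hypothetical column names, some inside the 'title' family and some not.
columns = ["title:", "title:en", "content:", "subtitle:"]
selected = matching_columns("title:.*$", columns)
```

Here `selected` keeps only the names that start with `title:`; `subtitle:` is excluded because the match is anchored at the start of the name.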
-
