(moving to the hbase user ML) I think streaming used to work correctly in
hbase 0.19, since the RowResult class included the value in its toString
output (you had to parse it out). But now that Result is made of KeyValues,
and KeyValue doesn't include the value in its toString, I don't see how
TableInputFormat could be used directly. You could, though, write your own
InputFormat that wraps TIF and returns a specific format for each cell.

Hope that somehow helps,

J-D
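For illustration, here is a minimal, untested sketch of such a wrapper,
assuming the 0.90-era org.apache.hadoop.hbase.mapred API this thread is
running against (the package and class name, example.TextTableInputFormat,
and the family:qualifier=value output format are invented for the example):

// Hypothetical wrapper (untested sketch): delegates configuration and
// splits to the stock TableInputFormat, but renders every Result as a
// Text key/value pair so the cell values survive streaming's toString step.
package example;                        // made-up package name

import java.io.IOException;

import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapred.TableInputFormat;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.InputFormat;
import org.apache.hadoop.mapred.InputSplit;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.JobConfigurable;
import org.apache.hadoop.mapred.RecordReader;
import org.apache.hadoop.mapred.Reporter;

public class TextTableInputFormat
    implements InputFormat<Text, Text>, JobConfigurable {

  private final TableInputFormat delegate = new TableInputFormat();

  public void configure(JobConf job) {
    delegate.configure(job);            // still reads hbase.mapred.tablecolumns
  }

  public InputSplit[] getSplits(JobConf job, int numSplits) throws IOException {
    return delegate.getSplits(job, numSplits);
  }

  public RecordReader<Text, Text> getRecordReader(InputSplit split,
      JobConf job, Reporter reporter) throws IOException {
    final RecordReader<ImmutableBytesWritable, Result> inner =
        delegate.getRecordReader(split, job, reporter);
    return new RecordReader<Text, Text>() {
      public Text createKey()   { return new Text(); }
      public Text createValue() { return new Text(); }
      public long getPos() throws IOException { return inner.getPos(); }
      public float getProgress() throws IOException { return inner.getProgress(); }
      public void close() throws IOException { inner.close(); }

      public boolean next(Text key, Text value) throws IOException {
        ImmutableBytesWritable row = inner.createKey();
        Result result = inner.createValue();
        if (!inner.next(row, result)) {
          return false;
        }
        key.set(Bytes.toStringBinary(row.get()));
        // Render each cell as family:qualifier=value, space-separated.
        StringBuilder cells = new StringBuilder();
        for (KeyValue kv : result.raw()) {
          if (cells.length() > 0) {
            cells.append(' ');
          }
          cells.append(Bytes.toStringBinary(kv.getFamily()))
               .append(':')
               .append(Bytes.toStringBinary(kv.getQualifier()))
               .append('=')
               .append(Bytes.toStringBinary(kv.getValue()));
        }
        value.set(cells.toString());
        return true;
      }
    };
  }
}

With a jar containing that class on the job classpath, the streaming job
could pass -inputformat example.TextTableInputFormat instead of the stock
TableInputFormat, and each cell value would then reach the mapper on stdin.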
2011/2/19 Ondrej Holecek <[email protected]>:
> I don't think you understood me correctly.
>
> I get this line:
>
> 72 6f 77 31   keyvalues={row1/family1:a/1298037737154/Put/vlen=1,
> row1/family1:b/1298037744658/Put/vlen=1,
> row1/family1:c/1298037748020/Put/vlen=1}
>
> I know "72 6f 77 31" is the key and the rest is the value; let's call it
> the mapreduce-value. In this mapreduce-value there is
> "row1/family1:a/1298037737154/Put/vlen=1", which is the hbase-row name,
> hbase-column name and hbase-timestamp. But I expect the hbase-value too.
>
> So my question is: what do I do to make TableInputFormat send this
> hbase-value as well?
>
> Ondrej
>
> On 02/19/11 16:41, ShengChang Gu wrote:
>> By default, the prefix of a line up to the first tab character is the
>> key and the rest of the line (excluding the tab character) will be the
>> value. If there is no tab character in the line, then the entire line
>> is considered the key and the value is null. However, this can be
>> customized; use:
>>
>> -D stream.map.output.field.separator=.
>> -D stream.num.map.output.key.fields=4
>>
>> 2011/2/19 Ondrej Holecek <[email protected]>:
>>
>> Thank you, I've spent a lot of time debugging but didn't notice
>> this typo :(
>>
>> Now it works, but I don't understand one thing: on stdin I get this:
>>
>> 72 6f 77 31   keyvalues={row1/family1:a/1298037737154/Put/vlen=1,
>> row1/family1:b/1298037744658/Put/vlen=1,
>> row1/family1:c/1298037748020/Put/vlen=1}
>> 72 6f 77 32   keyvalues={row2/family1:a/1298037755440/Put/vlen=2,
>> row2/family1:b/1298037758241/Put/vlen=2,
>> row2/family1:c/1298037761198/Put/vlen=2}
>> 72 6f 77 33   keyvalues={row3/family1:a/1298037767127/Put/vlen=3,
>> row3/family1:b/1298037770111/Put/vlen=3,
>> row3/family1:c/1298037774954/Put/vlen=3}
>>
>> I see everything there but the value. What should I do to get the
>> value on stdin too?
>>
>> Ondrej
>>
>> On 02/18/11 20:01, Jean-Daniel Cryans wrote:
>> > You have a typo: it's hbase.mapred.tablecolumns, not
>> > hbase.mapred.tablecolumn.
>> >
>> > J-D
>> >
>> > On Fri, Feb 18, 2011 at 6:05 AM, Ondrej Holecek <[email protected]> wrote:
>> >> Hello,
>> >>
>> >> I'm testing hadoop and hbase. I can run mapreduce streaming or
>> >> pipes jobs against text files on hadoop, but I have a problem when
>> >> I try to run the same job against an hbase table.
>> >>
>> >> The table looks like this:
>> >>
>> >> hbase(main):015:0> scan 'table1'
>> >> ROW            COLUMN+CELL
>> >>  row1          column=family1:a, timestamp=1298037737154, value=1
>> >>  row1          column=family1:b, timestamp=1298037744658, value=2
>> >>  row1          column=family1:c, timestamp=1298037748020, value=3
>> >>  row2          column=family1:a, timestamp=1298037755440, value=11
>> >>  row2          column=family1:b, timestamp=1298037758241, value=22
>> >>  row2          column=family1:c, timestamp=1298037761198, value=33
>> >>  row3          column=family1:a, timestamp=1298037767127, value=111
>> >>  row3          column=family1:b, timestamp=1298037770111, value=222
>> >>  row3          column=family1:c, timestamp=1298037774954, value=333
>> >> 3 row(s) in 0.0240 seconds
>> >>
>> >> And the command I use, with the exception I get:
>> >>
>> >> # hadoop jar /usr/lib/hadoop/contrib/streaming/hadoop-streaming-0.20.2+737.jar \
>> >>     -D hbase.mapred.tablecolumn=family1: -input table1 -output /mtestout45 \
>> >>     -mapper test-map -numReduceTasks 1 -reducer test-reduce \
>> >>     -inputformat org.apache.hadoop.hbase.mapred.TableInputFormat
>> >>
>> >> packageJobJar: [/var/lib/hadoop/cache/root/hadoop-unjar8960137205806573426/] []
>> >> /tmp/streamjob8218197708173702571.jar tmpDir=null
>> >> 11/02/18 14:45:48 INFO mapred.JobClient: Cleaning up the staging area
>> >> hdfs://oho-nnm.dev.chservices.cz/var/lib/hadoop/cache/mapred/mapred/staging/root/.staging/job_201102151449_0035
>> >> Exception in thread "main" java.lang.RuntimeException: Error in configuring object
>> >>         at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
>> >>         at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
>> >>         at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
>> >>         at org.apache.hadoop.mapred.JobConf.getInputFormat(JobConf.java:597)
>> >>         at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:926)
>> >>         at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:918)
>> >>         at org.apache.hadoop.mapred.JobClient.access$500(JobClient.java:170)
>> >>         at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:834)
>> >>         at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:793)
>> >>         at java.security.AccessController.doPrivileged(Native Method)
>> >>         at javax.security.auth.Subject.doAs(Subject.java:396)
>> >>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1063)
>> >>         at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:793)
>> >>         at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:767)
>> >>         at org.apache.hadoop.streaming.StreamJob.submitAndMonitorJob(StreamJob.java:922)
>> >>         at org.apache.hadoop.streaming.StreamJob.run(StreamJob.java:123)
>> >>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>> >>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>> >>         at org.apache.hadoop.streaming.HadoopStreaming.main(HadoopStreaming.java:50)
>> >>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> >>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>> >>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>> >>         at java.lang.reflect.Method.invoke(Method.java:597)
>> >>         at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
>> >> Caused by: java.lang.reflect.InvocationTargetException
>> >>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> >>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>> >>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>> >>         at java.lang.reflect.Method.invoke(Method.java:597)
>> >>         at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
>> >>         ... 23 more
>> >> Caused by: java.lang.NullPointerException
>> >>         at org.apache.hadoop.hbase.mapred.TableInputFormat.configure(TableInputFormat.java:51)
>> >>         ... 28 more
>> >>
>> >> Can anyone tell me what I am doing wrong?
>> >>
>> >> Regards,
>> >> Ondrej
>>
>> --
>> 阿昌
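Putting the two fixes from this thread together, that is, the correctly
spelled hbase.mapred.tablecolumns property plus a value-emitting wrapper
like the sketch near the top, the invocation might look roughly like the
following (the wrapper jar path is hypothetical, and -libjars support
depends on the streaming build in use):

# hadoop jar /usr/lib/hadoop/contrib/streaming/hadoop-streaming-0.20.2+737.jar \
    -D hbase.mapred.tablecolumns=family1: \
    -libjars /path/to/text-table-inputformat.jar \
    -input table1 -output /mtestout45 \
    -mapper test-map -numReduceTasks 1 -reducer test-reduce \
    -inputformat example.TextTableInputFormat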
