1) Please try trunk.

2) Like in SQL, a single-line comment starts with two dashes: "--"
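
For example (a minimal sketch, using made-up relation names), you can disable one line of a script without touching the rest:

```pig
-- whole-line comment: everything after the two dashes is ignored
A = LOAD 'log' AS (rowkey, data);
-- B = LIMIT A 10;    this line is now skipped by the parser
```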


D
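
If moving to trunk is not immediately possible, one thing worth trying (an assumption on my part, based on the stack trace ending in HTable.validatePut, which rejects a Put that contains no columns) is to make sure no tuple reaches the STORE with all of its column values null, e.g.:

```pig
-- Hypothetical guard: drop tuples whose column fields are all null,
-- since an all-null tuple would produce an empty Put.
E = FILTER D BY ip IS NOT NULL;
STORE E INTO 'geoip_pig' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage(
    'location:ip location:country_code location:country location:state location:city');
```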

On Fri, Sep 16, 2011 at 2:10 AM, Damien Hardy <[email protected]> wrote:

> Hello there,
>
> Based on
> http://www.cloudera.com/blog/2009/06/analyzing-apache-logs-with-pig/
> I want to add geolocation to my haproxy raw logs stored in an HBase table.
>
> Here is my pig script (wrapper.sh is a self-extracting bash archive that
> deploys the perl script and its dependencies, very close to the one in my
> reference, and launches it):
>
> DEFINE iplookup `wrapper.sh GeoIP`
> ship ('wrapper.sh')
> cache('/GeoIP/GeoIPcity.dat#GeoIP');
>
> A = load 'log' using org.apache.pig.backend.hadoop.hbase.HBaseStorage(
> 'default:body','-gt=_f:squid_t:201109151630 -loadKey') AS (rowkey, data);
> B = LIMIT A 10;
> C = FOREACH B {
>        t = REGEX_EXTRACT(data,
> '([0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}):([0-9]+) ',1);
>        generate rowkey, t;
> }
> D = STREAM C THROUGH iplookup AS (rowkey, ip, country_code, country, state,
> city);
> STORE D INTO 'geoip_pig' USING org.apache.pig.backend.hadoop.hbase.HBaseStorage(
> 'location:ip location:country_code location:country
> location:state location:city');
>
> I can DUMP D; without problem, and I get what was promised, with my
> geolocation:
> (_f:squid_t:20110916103000_b:squid_s:200-+PH/I6eJ9h8Sy8/1+yz2kw==,77.192.16.143,FR,France,B9,Lyon)
> (_f:squid_t:20110916103000_b:squid_s:200-+XSr1ZpMyLGmi8iDvZ4lLQ==,80.13.204.64,FR,France,A8,Paris)
> (_f:squid_t:20110916103000_b:squid_s:200-+gl66vwlvPL9Di1zzut9Bg==,178.250.1.40,FR,France,,)
> (_f:squid_t:20110916103000_b:squid_s:200-+qAtjeGfssc2vkwWR4fmJQ==,86.73.78.25,FR,France,A8,La Courneuve)
> (_f:squid_t:20110916103000_b:squid_s:200-+wgQq1q8H/vp52//EIevzA==,80.13.204.64,FR,France,A8,Paris)
> (_f:squid_t:20110916103000_b:squid_s:200-/3J9EosV46v521VBlb6zxQ==,82.127.103.161,FR,France,B6,Erquery)
> (_f:squid_t:20110916103000_b:squid_s:200-/3okAiWeWMmpm54Qlk7JyQ==,86.75.127.253,FR,France,B5,La Daguenière)
> (_f:squid_t:20110916103000_b:squid_s:200-/yZ09fLNWflcBlWX1BjEkA==,83.200.13.146,FR,France,A8,Villiers-le-bel)
> (_f:squid_t:20110916103000_b:squid_s:200-0/HiVaFE6b1zrUTtHkV05Q==,193.228.156.10,FR,France,,)
> (_f:squid_t:20110916103000_b:squid_s:200-0CTc6LQ9jGpgQQLwmJZxQQ==,195.93.102.10,FR,France,,)
>
>
>
> But when I want to store (the last line) into a new, already-created HTable,
> I get the following error message in the reduce task's JT UI:
>
> java.io.IOException: java.lang.IllegalArgumentException: No columns to insert
>        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.runPipeline(PigMapReduce.java:439)
>        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.cleanup(PigMapReduce.java:492)
>        at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:178)
>        at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:572)
>        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:414)
>        at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
>        at java.security.AccessController.doPrivileged(Native Method)
>        at javax.security.auth.Subject.doAs(Subject.java:396)
>        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
>        at org.apache.hadoop.mapred.Child.main(Child.java:264)
> Caused by: java.lang.IllegalArgumentException: No columns to insert
>        at org.apache.hadoop.hbase.client.HTable.validatePut(HTable.java:845)
>        at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:677)
>        at org.apache.hadoop.hbase.client.HTable.put(HTable.java:667)
>        at org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:127)
>        at org.apache.hadoop.hbase.mapreduce.TableOutputFormat$TableRecordWriter.write(TableOutputFormat.java:82)
>        at org.apache.pig.backend.hadoop.hbase.HBaseStorage.putNext(HBaseStorage.java:431)
>        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:138)
>        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputFormat$PigRecordWriter.write(PigOutputFormat.java:97)
>        at org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.write(ReduceTask.java:514)
>        at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
>        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Reduce.runPipeline(PigMapReduce.java:437)
>        ... 9 more
>
>
> Other question: what is the syntax for comments in a pig script (besides
> /* ... */) to quickly exclude a single line?
>
> I use cdh3u1 packages.
>
> Thank you for helping.
>
> Regards,
>
> --
> Damien
>
