Thank you very much for responding :-)
I also found this one: http://www.deerwalk.com/bulk_importing_data ,
which seems very informative.
The thing is, when I created and ran a simple (custom) bulk loading job
locally (in pseudo-distributed mode), the following error occurred:
... INFO mapred.JobClient: Task Id :
attempt_201207232344_0001_m_000000_0, Status : FAILED
java.lang.IllegalArgumentException: *Can't read partitions file*
at
org.apache.hadoop.hbase.mapreduce.hadoopbackport.TotalOrderPartitioner.setConf(TotalOrderPartitioner.java:111)
...
While googling for a solution, I came across this link:
http://hbase.apache.org/book/trouble.mapreduce.html
which implies a misconfiguration: the job seems to expect a fully
distributed environment.
I would therefore like to ask: is it even possible to bulk import data
in pseudo-distributed mode, and if so, does anyone have a guess as to
what causes this error?
Thanks in advance!
IP
On 07/23/2012 07:40 AM, Sonal Goyal wrote:
Hi,
You can check the bulk loading section at
http://hbase.apache.org/book/arch.bulk.load.html
Best Regards,
Sonal
Crux: Reporting for HBase <https://github.com/sonalgoyal/crux>
Nube Technologies <http://www.nubetech.co>
<http://in.linkedin.com/in/sonalgoyal>
On Mon, Jul 23, 2012 at 6:15 AM, Ioakim Perros <[email protected]> wrote:
Hi,
Is there any efficient way (beyond the trivial approach of using
TableMapReduceUtil / TableOutputFormat) to perform faster read and write
operations on tables? Could anyone provide some example code?
As for faster importing into a table, I am aware of tools such as
completebulkload, but I would prefer to trigger such a process through M/R
code, as I would like a whole table to be read and updated over
iterations of M/R jobs.
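To be concrete, the kind of driver I have in mind is roughly the following sketch. The table name, paths, column family and the toy mapper are placeholders I made up, and it assumes a running HBase cluster; as far as I understand, HFileOutputFormat.configureIncrementalLoad sets up the TotalOrderPartitioner and the reducer automatically:

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.HFileOutputFormat;
import org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class BulkLoadSketch {

  // Toy mapper: parses "rowkey,value" lines into KeyValues for family "cf".
  static class LineToKeyValueMapper
      extends Mapper<LongWritable, Text, ImmutableBytesWritable, KeyValue> {
    @Override
    protected void map(LongWritable offset, Text line, Context context)
        throws IOException, InterruptedException {
      String[] parts = line.toString().split(",", 2);
      byte[] row = Bytes.toBytes(parts[0]);
      KeyValue kv = new KeyValue(row, Bytes.toBytes("cf"),
          Bytes.toBytes("q"), Bytes.toBytes(parts[1]));
      context.write(new ImmutableBytesWritable(row), kv);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    Job job = new Job(conf, "bulk-load-sketch");
    job.setJarByClass(BulkLoadSketch.class);
    job.setMapperClass(LineToKeyValueMapper.class);
    job.setMapOutputKeyClass(ImmutableBytesWritable.class);
    job.setMapOutputValueClass(KeyValue.class);

    FileInputFormat.addInputPath(job, new Path("/input"));
    FileOutputFormat.setOutputPath(job, new Path("/hfiles"));

    // Sets the output format, the reducer and the TotalOrderPartitioner,
    // and writes the partitions file from the table's region boundaries.
    HTable table = new HTable(conf, "mytable");
    HFileOutputFormat.configureIncrementalLoad(job, table);

    if (job.waitForCompletion(true)) {
      // Move the generated HFiles into the table's regions.
      new LoadIncrementalHFiles(conf).doBulkLoad(new Path("/hfiles"), table);
    }
  }
}
```

This is only how I understand the flow from the docs; I am not sure whether doBulkLoad needs anything more in pseudo-distributed mode.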
Thanks in advance!
IP