Can you show the code inside saveASHFile?
Maybe the partitions of the RDD need to be sorted (for the 1st issue).
Cheers
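To illustrate that suggestion with a hedged sketch (the RDD shape, the column family "cf", and qualifier "q" are placeholders, not taken from the original post): HFileOutputFormat2 requires keys to arrive in lexical row-key order, so the usual pattern is a global sort before saveAsNewAPIHadoopFile, something like:

```scala
import org.apache.hadoop.hbase.KeyValue
import org.apache.hadoop.hbase.io.ImmutableBytesWritable
import org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2
import org.apache.hadoop.hbase.util.Bytes
import org.apache.spark.rdd.RDD

// Sketch, assuming an RDD of (rowKey, value) string pairs and a single
// column family/qualifier -- all placeholders.
def writeHFiles(rdd: RDD[(String, String)], outputPath: String): Unit = {
  rdd
    // Global sort by row key first; without this, HFileOutputFormat2 throws
    // "Added a key not lexically larger than previous". (String order and
    // byte order agree for ASCII keys; byte[]-based sorting is safer in general.)
    .sortByKey()
    .map { case (row, value) =>
      val rowBytes = Bytes.toBytes(row)
      val kv = new KeyValue(rowBytes, Bytes.toBytes("cf"), Bytes.toBytes("q"),
                            Bytes.toBytes(value))
      (new ImmutableBytesWritable(rowBytes), kv)
    }
    .saveAsNewAPIHadoopFile(
      outputPath,
      classOf[ImmutableBytesWritable],
      classOf[KeyValue],
      classOf[HFileOutputFormat2])
}
```

With multiple qualifiers per row, the cells within a row would also need to be emitted in qualifier order, which this single-qualifier sketch sidesteps.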
On Wed, Jul 13, 2016 at 4:29 PM, yeshwanth kumar wrote:
> Hi i am doing bulk load into HBase as HFileFormat, by
> using saveAsNewAPIHadoopFile
Hi, I am doing a bulk load into HBase as HFileFormat, by
using saveAsNewAPIHadoopFile.
I am on HBase 1.2.0-cdh5.7.0 and Spark 1.6.
When I try to write, I am getting an exception:
java.io.IOException: Added a key not lexically larger than previous.
Following is the code snippet:
case class
I agree. I'm not an expert though; I run more Pig jobs than anything else.
Does anyone else on the thread have more experience creating MR jobs on HBase
data?
On Wednesday, July 13, 2016, Frank Luo wrote:
> It will work, but it is pretty awkward way to create more mappers.
It will work, but it is a pretty awkward way to create more mappers.
From: Billy Watson [mailto:williamrwat...@gmail.com]
Sent: Wednesday, July 13, 2016 3:57 PM
To: Frank Luo
Cc: user@hbase.apache.org
Subject: Re: Re: is it possible to create multiple TableSplit per region?
It seems like it might be faster, then, to consider a map job followed by
another map job. Or, depending on the web-service calls, maybe a combine
step?
William Watson
Lead Software Engineer
On Wed, Jul 13, 2016 at 4:40 PM, Frank Luo wrote:
It makes a number of web-service calls.
From: Billy Watson [mailto:williamrwat...@gmail.com]
Sent: Wednesday, July 13, 2016 3:27 PM
To: user@hbase.apache.org
Cc: Frank Luo
Subject: Re: Re: is it possible to create multiple TableSplit per region?
What do you mean by "heavy work downstream"?
I think the mailing list might need a *few* more details to help out better.
William Watson
On Wed, Jul 13, 2016 at 12:32 PM, Frank Luo wrote:
> Thanks for the prompt reply, Lu.
Thanks for the prompt reply, Lu.
It is true that having a smaller region file size can solve the problem. But it
also has side effects. For example, the total number of regions can easily be
doubled/tripled, and I am already facing a challenge of having too many regions
per server. So I cannot go that route.
Here is an archived mail:
http://mail-archives.apache.org/mod_mbox/hbase-user/201303.mbox/%3cblu0-smtp19115a8967869d6cf0d49ef8f...@phx.gbl%3E
At 2016-07-13 23:20:28, "Frank Luo" wrote:
We have mapper-only jobs operating on the result of a Scan. Because of heavy work
downstream, the mappers run fairly slowly. So I am wondering if there is a way
to create multiple TableSplits on one region, so that multiple mappers can be
created to work on different pieces of data in the region.
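One way this is often approached (and roughly what the archived mail above describes) is to subclass TableInputFormat and cut each region's split into sub-ranges. This is a hedged sketch, not HBase-provided functionality: the class name, the split factor, and the midpoint logic are all assumptions, and the empty start/end rows of the first and last regions would need extra handling.

```scala
import java.util.{ArrayList => JArrayList, List => JList}
import org.apache.hadoop.hbase.mapreduce.{TableInputFormat, TableSplit}
import org.apache.hadoop.hbase.util.Bytes
import org.apache.hadoop.mapreduce.{InputSplit, JobContext}

// Sketch: subdivide every region's split into N sub-ranges so that N mappers
// run per region. Name and factor are placeholders.
class SubdividingTableInputFormat extends TableInputFormat {
  private val splitsPerRegion = 2 // placeholder factor

  override def getSplits(context: JobContext): JList[InputSplit] = {
    val result = new JArrayList[InputSplit]()
    val it = super.getSplits(context).iterator()
    while (it.hasNext) {
      val split = it.next().asInstanceOf[TableSplit]
      // Bytes.split produces evenly spaced boundary keys between start and
      // end; it does not handle the empty first/last region boundaries.
      val bounds = Bytes.split(split.getStartRow, split.getEndRow,
                               splitsPerRegion - 1)
      for (i <- 0 until bounds.length - 1) {
        result.add(new TableSplit(split.getTable, bounds(i), bounds(i + 1),
                                  split.getRegionLocation))
      }
    }
    result
  }
}
```

The job would then use this class in place of TableInputFormat; data locality is preserved since each sub-split keeps its parent split's region location.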
Hello guys,
I have a Spark SQL app which writes some data to HBase; however, the app hangs
without any exception or error.
Here is my code:
// code base: https://hbase.apache.org/book.html#scala
val sparkMasterUrlDev = "spark://master60:7077"
val sparkMasterUrlLocal = "local[2]"
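For comparison, here is a minimal, hedged sketch of writing rows to HBase from Spark via TableOutputFormat, in the spirit of the HBase book's Scala example; the table name "demo_table", the column family "cf", and the local master are placeholders, not values from the original post:

```scala
import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.client.Put
import org.apache.hadoop.hbase.io.ImmutableBytesWritable
import org.apache.hadoop.hbase.mapreduce.TableOutputFormat
import org.apache.hadoop.hbase.util.Bytes
import org.apache.hadoop.mapreduce.Job
import org.apache.spark.{SparkConf, SparkContext}

// Minimal write-to-HBase sketch; all names below are placeholders.
object HBaseWriteSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf()
      .setAppName("hbase-write-sketch").setMaster("local[2]"))

    val hbaseConf = HBaseConfiguration.create()
    hbaseConf.set(TableOutputFormat.OUTPUT_TABLE, "demo_table")
    val job = Job.getInstance(hbaseConf)
    job.setOutputFormatClass(
      classOf[TableOutputFormat[ImmutableBytesWritable]])

    sc.parallelize(Seq("row1" -> "v1", "row2" -> "v2"))
      .map { case (row, value) =>
        val put = new Put(Bytes.toBytes(row))
        put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("q"),
                      Bytes.toBytes(value))
        (new ImmutableBytesWritable(Bytes.toBytes(row)), put)
      }
      .saveAsNewAPIHadoopDataset(job.getConfiguration)

    sc.stop()
  }
}
```

When a write like this hangs silently, one common first check is whether the chosen master URL actually has executors registered; with a cluster master and no free resources, the job can sit pending with no error in the driver log.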