Thanks Ted!
On Thu, Jan 14, 2016 at 4:49 PM, Ted Yu wrote:
For #1, yes it is possible.
You can find some examples in the hbase-spark module of HBase, where HBase
is provided as a DataSource, e.g.:
https://github.com/apache/hbase/blob/master/hbase-spark/src/main/scala/org/apache/hadoop/hbase/spark/HBaseRDDFunctions.scala
Cheers
On Thu, Jan 14, 2016 at 5:04 AM, K
Hi
We have an RDD that needs to be mapped with information from
HBase, where the exact key is the user id.
What are the different alternatives for doing this?
- Is it possible to do HBase.get() requests from a map function in Spark?
- Or should we join the RDD with a full HBase table scan?
I ask be
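For the first alternative in the question above (point lookups from inside a Spark function), one common pattern is to do the lookups once per partition and in batches, rather than one get per record. The following is a minimal sketch in plain Python, not the hbase-spark API: `fetch_rows` is a hypothetical stand-in for whatever HBase client call performs a multi-get, `FAKE_TABLE` replaces a real table so the sketch runs without a cluster, and the `(user_id, value)` record shape is an assumption based on the question.

```python
from itertools import islice


def batched(iterable, size):
    """Yield lists of at most `size` items from `iterable`."""
    it = iter(iterable)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def enrich_partition(records, fetch_rows, batch_size=100):
    """Attach looked-up data to each (user_id, value) record of one partition.

    `fetch_rows` is a hypothetical callable: it takes a list of user ids and
    returns a dict {user_id: row}. A real job would open one HBase connection
    per partition and issue a multi-get inside it.
    """
    for chunk in batched(records, batch_size):
        rows = fetch_rows([user_id for user_id, _ in chunk])
        for user_id, value in chunk:
            # Missing keys map to None instead of failing the whole task.
            yield user_id, value, rows.get(user_id)


# Stand-in for HBase so the sketch runs without a cluster.
FAKE_TABLE = {"u1": {"name": "Ann"}, "u2": {"name": "Bob"}}


def fake_fetch(user_ids):
    return {uid: FAKE_TABLE[uid] for uid in user_ids if uid in FAKE_TABLE}


result = list(
    enrich_partition([("u1", 10), ("u3", 7), ("u2", 5)], fake_fetch, batch_size=2)
)
```

In PySpark, a function of this shape could be passed to `rdd.mapPartitions`. Point lookups tend to pay off when the RDD is small relative to the table, while a join against a full scan reads every row regardless of how many keys are needed.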
I wanted to confirm whether this is now supported, for example in Spark v1.3.0.
I've read varying information online and just thought I'd verify.
Thanks
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Python-Spark-and-HBase-tp6142p24117.html
Sent from th
The 4.1 GB table has 3 regions. This means that at least 2 of the nodes
don't host any of its regions.
Can you split this table into 12 (or more) regions?
BTW, what's the value of spark.yarn.executor.memoryOverhead?
Cheers
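The region arithmetic behind this advice can be sketched as follows. The node count of 5 comes from the cluster description in the quoted message; the even spread of regions across nodes is an assumption about how the HBase balancer behaves, and `region_spread` is a hypothetical helper, not part of any library.

```python
def region_spread(num_regions, num_nodes):
    """Return (nodes that cannot host a region, regions per node if balanced)."""
    min_idle_nodes = max(0, num_nodes - num_regions)
    per_node_if_balanced = num_regions // num_nodes
    return min_idle_nodes, per_node_if_balanced


current = region_spread(3, 5)    # the table as described: 3 regions, 5 nodes
proposed = region_spread(12, 5)  # after splitting into 12 regions
```

With 3 regions, at least 2 of the 5 nodes serve no data for the table; with 12 regions, every node can host at least 2. Since a scan through TableInputFormat typically produces one input split per region, more regions also means more read tasks that can run in parallel.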
On Sat, Mar 14, 2015 at 10:52 AM, francexo83 wrote:
Hi all,
I have the following cluster configuration:
- 5 nodes in a cloud environment.
- Hadoop 2.5.0.
- HBase 0.98.6.
- Spark 1.2.0.
- 8 cores and 16 GB of RAM on each host.
- 1 NFS disk with 300 IOPS mounted on host 1 and 2.
- 1 NFS disk with 300 IOPS mounted on host
python/lib/py4j-0.8.1-src.zip/py4j/java_gateway.py
in __getattr__(self, name)
    657         else:
    658             raise Py4JError('{0} does not exist in the JVM'.
--> 659                 format(self._fqn + name))
    660
    661     def __call__(self, *args):
Py4JError: org.apache.spark.api.python.PythonRDDnewAPIHadoopFile does not exist in the JVM
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Python-Spark-and-HBase-tp6142p6507.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
e',
value_class='org.apache.hadoop.hbase.client.Result'
Is it possible that the typo is coming from inside the spark code?
Tommer
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Python-Spark-and-HBase-tp6142p6506.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
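On the question of where the apparent typo comes from: the traceback in this thread shows py4j building the message with `self._fqn + name`, that is, with no separator between the two parts. A minimal sketch of that string construction follows. The split into a class `PythonRDD` and a member `newAPIHadoopFile` is an assumption inferred from the error text, not something the thread confirms, and `missing_member_message` is a hypothetical helper written for illustration.

```python
def missing_member_message(fqn, name):
    """Rebuild the error text the way the traceback shows py4j doing it."""
    # Same expression as in the traceback: format(self._fqn + name)
    return '{0} does not exist in the JVM'.format(fqn + name)


message = missing_member_message(
    'org.apache.spark.api.python.PythonRDD',  # assumed class name
    'newAPIHadoopFile',                       # assumed member name
)
```

If that reading is right, the name is not misspelled anywhere in Spark. It would instead suggest that the `PythonRDD` class loaded in the JVM has no `newAPIHadoopFile` member, which would be consistent with running a Spark build that does not include the patch.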
rmats branch and cannot find any
reference to the class org.apache.spark.api.python.PythonRDDnewAPIHadoopFile
Any ideas?
Also, do you have a working example of HBase access with the new code?
Thanks
Tommer
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Python-Spark-and-HBase-tp6142p6502.html
Thanks Nick and Matei. I'll take a look at the patch and keep you updated.
Tommer
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Python-Spark-and-HBase-tp6142p6176.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
Format class. Is there any equivalent in python?
Thanks
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Python-Spark-and-HBase-tp6142.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
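For readers looking for the Python side of this, later PySpark releases expose `SparkContext.newAPIHadoopRDD`, which takes the input format, key and value classes by name. The sketch below only assembles the arguments, so it runs without a cluster. Several things in it are assumptions: the keyword names follow later PySpark releases and may not match the patch discussed in this thread (whose own snippet uses `value_class`), the two converter classes are the ones shipped with the Spark examples jar and must be on the classpath, and the ZooKeeper host and table name are placeholders.

```python
def hbase_read_kwargs(zk_quorum, table):
    """Build keyword arguments for SparkContext.newAPIHadoopRDD."""
    return {
        "inputFormatClass": "org.apache.hadoop.hbase.mapreduce.TableInputFormat",
        "keyClass": "org.apache.hadoop.hbase.io.ImmutableBytesWritable",
        "valueClass": "org.apache.hadoop.hbase.client.Result",
        # Assumption: converters from the Spark examples jar are available.
        "keyConverter": "org.apache.spark.examples.pythonconverters."
                        "ImmutableBytesWritableToStringConverter",
        "valueConverter": "org.apache.spark.examples.pythonconverters."
                          "HBaseResultToStringConverter",
        "conf": {
            "hbase.zookeeper.quorum": zk_quorum,
            "hbase.mapreduce.inputtable": table,
        },
    }


def read_hbase_table(sc, zk_quorum, table):
    """Return an RDD over the table; `sc` must be a live SparkContext."""
    return sc.newAPIHadoopRDD(**hbase_read_kwargs(zk_quorum, table))


# Placeholder host and table name, for illustration only.
kwargs = hbase_read_kwargs("localhost", "users")
```

Keeping the argument construction separate from the call makes it easy to inspect or log the configuration before a job is submitted.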
an index for each column in my table and I store complex
objects within the cells. Is it correct?
Best,
Flavio
On Tue, Apr 8, 2014 at 6:05 PM, Bin Wang wrote:
> Hi Flavio,
>
> I happen to be attending the 2014 Apache Conf, where I heard about a
> project called "Apache Phoenix", which fully leverages HBase and is
> supposed to be 1000x faster than Hive. It is also not memory bound,
> whereas memory sets a limit for Spark. It is still in the incubating
> group, and the "stats" functions Spark has already implemented are still
> on its roadmap. I am not sure whether it will be good, but it might be
> something interesting to check out.
>
> /usr/bin
>
> On Tue, Apr 8, 2014 at 9:57 AM, Flavio Pompermaier <pomperma...@okkam.it> wrote:
>
>> Hi to everybody,
>>
>> In recent days I have looked a bit at the evolution of the big data
>> stacks, and it seems that HBase is somehow fading away in favour of
>> Spark+HDFS. Am I correct?
>> Do you think that Spark and HBase should work together or not?
>>
>> Best regards,
>> Flavio
Hi to everybody,
In recent days I have looked a bit at the evolution of the big data stacks,
and it seems that HBase is somehow fading away in favour of Spark+HDFS. Am
I correct?
Do you think that Spark and HBase should work together or not?
Best regards,
Flavio