I'd suggest that the hbase-downstreamer project[1] is a better place
for folks to see these examples. There's already an example for Spark
Streaming that does not rely on any of the new goodness in the
hbase-spark module[2].
Granted, it uses the Spark Java APIs[3], but we'd be glad to have a
scala
On Tue, Apr 19, 2016 at 11:21 AM, Ted Yu wrote:
Clarification: in my previous email, I was not talking
about spark-streaming-flume artifact or spark-streaming-kafka artifact.
I was talking about examples for these projects, such
as examples//src/main/python/streaming/flume_wordcount.py
On Tue, Apr 19, 2016 at 11:10 AM, Marcelo Vanzin
wrote:
On Tue, Apr 19, 2016 at 11:07 AM, Ted Yu wrote:
> The same question can be asked w.r.t. examples for other projects, such as
> flume
> and kafka.
>
The main difference being that flume and kafka integration are part of
Spark itself. HBase integration is not.
The same question can be asked w.r.t. examples for other projects,
such as flume
and kafka.
On Tue, Apr 19, 2016 at 11:01 AM, Marcin Tustin
wrote:
Let's posit that the spark example is much better than what is available in
HBase. Why is that a reason to keep it within Spark?
On Tue, Apr 19, 2016 at 1:59 PM, Ted Yu wrote:
bq. HBase's current support, even if there are bugs or things that still
need to be done, is much better than the Spark example
In my opinion, a simple example that works is better than a buggy package.
I hope before long the hbase-spark module in HBase can arrive at a state
which we can advertis
You're completely missing my point. I'm saying that HBase's current
support, even if there are bugs or things that still need to be done,
is much better than the Spark example, which is basically a call to
"SparkContext.hadoopRDD".
Spark's example is not helpful in learning how to build an HBase
a
There is an open JIRA for fixing the documentation: HBASE-15473.
I would say the refguide link you provided should not be considered as
complete.
Note it is marked as Blocker by Sean B.
On Tue, Apr 19, 2016 at 10:43 AM, Marcelo Vanzin
wrote:
> You're entitled to your own opinions.
>
> While you
'bq.' is used in JIRA to quote what other people have said.
On Tue, Apr 19, 2016 at 10:42 AM, Reynold Xin wrote:
> Ted - what's the "bq" thing? Are you using some 3rd party (e.g. Atlassian)
> syntax? They are not being rendered in email.
>
>
> On Tue, Apr 19, 2016 at 10:41 AM, Ted Yu wrote:
>
>
You're entitled to your own opinions.
While you're at it, here's some much better documentation, from the
HBase project themselves, than what the Spark example provides:
http://hbase.apache.org/book.html#spark
On Tue, Apr 19, 2016 at 10:41 AM, Ted Yu wrote:
> bq. it's actually in use right now i
Ted - what's the "bq" thing? Are you using some 3rd party (e.g. Atlassian)
syntax? They are not being rendered in email.
On Tue, Apr 19, 2016 at 10:41 AM, Ted Yu wrote:
bq. it's actually in use right now in spite of not being in any upstream
HBase release
If it is not in upstream, then it is not relevant for discussion on Apache
mailing list.
On Tue, Apr 19, 2016 at 10:38 AM, Marcelo Vanzin
wrote:
Alright, if you prefer, I'll say "it's actually in use right now in
spite of not being in any upstream HBase release", and it's more
useful than a single example file in the Spark repo for those who
really want to integrate with HBase.
Spark's example is really very trivial (just uses one of HBase
bq. create a separate tarball for them
Probably another thread can be started for the above.
I am fine with it.
On Tue, Apr 19, 2016 at 10:34 AM, Marcelo Vanzin
wrote:
On Tue, Apr 19, 2016 at 10:28 AM, Reynold Xin wrote:
> Yea in general I feel examples that bring in a large amount of dependencies
> should be outside Spark.
Another option to avoid the dependency problem is to not ship examples
in the distribution, and maybe create a separate tarball for them;
r
bq. I wouldn't call it "incomplete".
I would call it incomplete.
Please see HBASE-15333 'Enhance the filter to handle short, integer, long,
float and double' which is a bug fix.
Please exclude the presence of the related module in vendor distros from this
discussion.
Thanks
On Tue, Apr 19, 2016 at 10:20 AM, Ted Yu wrote:
> I want to note that the hbase-spark module in HBase is incomplete. Zhan has
> several patches pending review.
I wouldn't call it "incomplete". Lots of functionality is there, which
doesn't mean new ones, or more efficient implementations of existi
Corrected typo in subject.
I want to note that the hbase-spark module in HBase is incomplete. Zhan has
several patches pending review.
The hbase-spark module is currently only in the master branch, which would be
released as 2.0.
However, the release date for 2.0 is unclear; it is probably half a year from now.