Downgrade only the KafkaIO module (also excluding any transitive
dependency of it) to the version that works for you.
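In Maven terms, the suggestion above might look like this sketch (the artifact IDs are Beam's published coordinates, but the version numbers and the exclusion list are placeholders you would adapt to your own build):

```xml
<!-- Keep the rest of the Beam SDK at the newer version... -->
<dependency>
  <groupId>org.apache.beam</groupId>
  <artifactId>beam-sdks-java-core</artifactId>
  <version>2.12.0</version>
</dependency>
<!-- ...but pin only the KafkaIO module to the version that worked,
     excluding its transitive Beam dependencies so they do not drag
     the older version back in. -->
<dependency>
  <groupId>org.apache.beam</groupId>
  <artifactId>beam-sdks-java-io-kafka</artifactId>
  <version>2.9.0</version>
  <exclusions>
    <exclusion>
      <groupId>org.apache.beam</groupId>
      <artifactId>beam-sdks-java-core</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```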
JC.
Lukasz Cwik wrote on Thu., May 2, 2019, 20:05:
> +dev
>
> On Thu, May 2, 2019 at 10:34 AM Moorhead,Richard <
> richard.moorhe...@cerner.com> wrote:
>
>>
Hi Max,
comments inline.
On 5/2/19 3:29 PM, Maximilian Michels wrote:
Couple of comments:
* Flink transforms
It wouldn't be hard to add a way to run arbitrary Flink operators
through the Beam API. Like you said, once you go down that road, you
lose the ability to run the pipeline on a
An example that I can think of as a feature that Beam could provide to
other runners is SQL. Beam SQL expands into Beam transforms, and it can run
on other runners. Flink and Spark do have SQL support because they've
invested in it, but think of smaller runners, e.g. Nemo.
Of course, not all of
Thanks Ismael.
On Thu, May 2, 2019 at 7:29 PM Ismaël Mejía wrote:
> Hello,
>
> Support for Parquet in HCatalog (Hive) started on version 3.0.0
> HIVE-8838 Support Parquet through HCatalog
> https://issues.apache.org/jira/browse/HIVE-8838?attachmentSortBy=dateTime
>
> The current version used on
+dev
On Thu, May 2, 2019 at 10:34 AM Moorhead,Richard <
richard.moorhe...@cerner.com> wrote:
> In Beam 2.9.0, this check was made:
>
>
>
If people don't want to use it because crucial libraries are written in
only some languages but not available in others, that makes some sense;
otherwise I would think it is biased (which is what happens most of the
time). A lot of the language arguments are biased anyway, since most of
them just
In Beam 2.9.0, this check was made:
https://github.com/apache/beam/blob/2ba00576e3a708bb961a3c64a2241d9ab32ab5b3/sdks/java/io/kafka/src/main/java/org/apache/beam/sdk/io/kafka/KafkaRecordCoder.java#L132
However, this logic was removed in 2.10+ in the newer ProducerRecordCoder class:
On Thu, May 2, 2019 at 6:29 AM Maximilian Michels wrote:
> Couple of comments:
>
> * Flink transforms
>
> It wouldn't be hard to add a way to run arbitrary Flink operators
> through the Beam API. Like you said, once you go down that road, you
> lose the ability to run the pipeline on a
+user@beam.apache.org
On Thu, May 2, 2019 at 9:51 AM Lukasz Cwik wrote:
> ... build pipeline ...
> pipeline_result = p.run()
> if job_taking_too_long:
>     pipeline_result.cancel()
>
> Python:
>
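A minimal runnable sketch of that cancel-on-timeout pattern, with a stub standing in for Beam's `PipelineResult` (the `StubPipelineResult` class, the `run_with_timeout` helper, and the `timeout_secs` parameter are hypothetical names for illustration, not part of the Beam API):

```python
import time


class StubPipelineResult:
    """Hypothetical stand-in for Beam's PipelineResult handle."""

    def __init__(self):
        self.state = "RUNNING"

    def cancel(self):
        # In real Beam, PipelineResult.cancel() asks the runner
        # to stop the running job.
        self.state = "CANCELLED"


def run_with_timeout(run_pipeline, timeout_secs):
    """Run the pipeline and cancel it if it exceeds the deadline."""
    start = time.monotonic()
    pipeline_result = run_pipeline()  # p.run() in real Beam code
    job_taking_too_long = (time.monotonic() - start) > timeout_secs
    if job_taking_too_long:
        pipeline_result.cancel()
    return pipeline_result
```

With a negative deadline the job is cancelled immediately; with a generous one it is left running, mirroring the control flow quoted above.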
Now it runs so well guys, what I have done is read the stream payload with
KafkaIO but using the parameter withoutMetadata and applying
Values.<String>create(), which will return a PCollection<String>, and then
I can process my payload as simple records.
thanks so much
On Wed, May 1, 2019 at 1:38 PM
Hello,
Support for Parquet in HCatalog (Hive) started on version 3.0.0
HIVE-8838 Support Parquet through HCatalog
https://issues.apache.org/jira/browse/HIVE-8838?attachmentSortBy=dateTime
The current version used in Beam is 2.1.0. I filed a new JIRA to tackle it:
BEAM-7209 - Update HCatalogIO to
Couple of comments:
* Flink transforms
It wouldn't be hard to add a way to run arbitrary Flink operators
through the Beam API. Like you said, once you go down that road, you
lose the ability to run the pipeline on a different Runner. And that's
precisely one of the selling points of Beam.
Hi,
I am trying to read a table in CDH's Hive which is in Parquet format.
I am able to see the data using beeline.
However, using HCatalogIO, all I get during read are NULL values for all
rows.
I see this problem only with the Parquet format. Other file formats work as
expected.
Has someone else
Hi Robert,
yes, that is correct. What I'm suggesting is a discussion whether Beam
should or should not support this type of pipeline customization without
hard forking code. But maybe this discussion should proceed in dev@.
Jan
On 5/2/19 1:44 PM, Robert Bradshaw wrote:
Correct, there's no
Correct, there's no out of the box way to do this. As mentioned, this
would also result in non-portable pipelines. However, even the
portability framework is set up such that runners can recognize
particular transforms and provide their own implementations thereof
(which is how translations are
Sorry if I wasn't clear, I mean in the sense that A is neither the
intersection nor union of B and C in the venn diagram below. As Max says,
what really matters is the set of features you care about.
[image: venn.png]
https://www.benfrederickson.com/images/venn.png
On Thu, May 2, 2019 at
Just to clarify - the code I posted is just a proposal, it is not
actually possible currently.
On 5/2/19 11:05 AM, Jan Lukavský wrote:
Hi,
I'd say that what Pankaj meant could be rephrased as "What if I want
to manually tune or tweak my Pipeline for specific runner? Do I have
any options
I don't understand the following statement.
"Beam's feature set is neither the intersection nor union of the feature
sets of those runners it has available as execution engines."
If Beam is neither intersection nor union how can Beam abstract anything?
Generally speaking, if there is nothing in
On Thu, May 2, 2019 at 12:13 AM Pankaj Chand wrote:
>
> I thought by choosing Beam, I would get the benefits of both (Spark and
> Flink), one at a time. Now, I'm understanding that I might not get the full
> potential from either of the two.
You get the benefit of being able to choose, without