This is great. I'll take a look today.
On Wed, Feb 26, 2020 at 9:42 AM Chuck Yang wrote:
Hi Devs,
I was able to get around to working on Avro file loads to BigQuery in
Python SDK and now have a PR available at
https://github.com/apache/beam/pull/10979 . Comments appreciated :)
Thanks,
Chuck
On Wed, Nov 27, 2019 at 10:10 AM Chuck Yang wrote:
I would love to fix this, but not sure if I have the bandwidth at the
moment. Anyway, created the jira here:
https://jira.apache.org/jira/browse/BEAM-8841
Thanks!
Chuck
I don't believe so, please create one (we can dedup if we happen to find
another issue).
Even better if you can contribute to fix this :)
Thanks,
Cham
On Tue, Nov 26, 2019 at 7:07 PM Chuck Yang wrote:
> Has anyone looked into implementing this for the Python SDK? It would
> be nice to have it
I'll take a look as well. Thanks for doing this!
On Fri, Oct 4, 2019 at 9:16 PM Pablo Estrada wrote:
Thanks Steve!
I'll take a look next week. Sorry about the delay so far.
Best
-P.
On Fri, Sep 27, 2019 at 10:37 AM Steve Niemitz wrote:
I put up a semi-WIP pull request https://github.com/apache/beam/pull/9665 for
this. The initial results look good. I'll spend some time soon adding
unit tests and documentation, but I'd appreciate it if someone could take a
first pass over it.
On Wed, Sep 18, 2019 at 6:14 PM Pablo Estrada wrote:
Thanks for offering to work on this! It would be awesome to have it. I can
say that we don't have that for Python ATM.
On Mon, Sep 16, 2019 at 10:56 AM Steve Niemitz wrote:
Our experience has actually been that avro is more efficient than even
parquet, but that might also be skewed from our datasets.
I might try to take a crack at this, I found
https://issues.apache.org/jira/browse/BEAM-2879 tracking it (which
coincidentally references my thread from a couple years
It's been talked about, but nobody's done anything. There are some
difficulties related to type conversion (json and avro don't support the
same types), but if those are overcome then an avro version would be much
more efficient. I believe Parquet files would be even more efficient if you
wanted to
Has anyone investigated using avro rather than json to load data into
BigQuery using BigQueryIO (+ FILE_LOADS)?
I'd be interested in enhancing it to support this, but I'm curious if
there's any prior work here.