I'll take a look as well. Thanks for doing this!

On Fri, Oct 4, 2019 at 9:16 PM Pablo Estrada <pabl...@google.com> wrote:

> Thanks Steve!
> I'll take a look next week. Sorry about the delay so far.
> Best
> -P.
>
> On Fri, Sep 27, 2019 at 10:37 AM Steve Niemitz <sniem...@apache.org>
> wrote:
>
>> I put up a semi-WIP pull request https://github.com/apache/beam/pull/9665 for
>> this.  The initial results look good.  I'll spend some time soon adding
>> unit tests and documentation, but I'd appreciate it if someone could take a
>> first pass over it.
>>
>> On Wed, Sep 18, 2019 at 6:14 PM Pablo Estrada <pabl...@google.com> wrote:
>>
>>> Thanks for offering to work on this! It would be awesome to have it. I
>>> can say that we don't have that for Python ATM.
>>>
>>> On Mon, Sep 16, 2019 at 10:56 AM Steve Niemitz <sniem...@apache.org>
>>> wrote:
>>>
>>>> Our experience has actually been that avro is more efficient than even
>>>> parquet, but that might also be skewed from our datasets.
>>>>
>>>> I might try to take a crack at this. I found
>>>> https://issues.apache.org/jira/browse/BEAM-2879 tracking it (which
>>>> coincidentally references my thread from a couple years ago on the read
>>>> side of this :) ).
>>>>
>>>> On Mon, Sep 16, 2019 at 1:38 PM Reuven Lax <re...@google.com> wrote:
>>>>
>>>>> It's been talked about, but nobody's done anything. There are some
>>>>> difficulties related to type conversion (json and avro don't support the
>>>>> same types), but if those are overcome then an avro version would be much
>>>>> more efficient. I believe Parquet files would be even more efficient if
>>>>> you wanted to go that path, but there might be more code to write (as we
>>>>> already have some code in the codebase to convert between TableRows and
>>>>> Avro).
>>>>>
>>>>> Reuven
>>>>>
>>>>> On Mon, Sep 16, 2019 at 10:33 AM Steve Niemitz <sniem...@apache.org>
>>>>> wrote:
>>>>>
>>>>>> Has anyone investigated using avro rather than json to load data into
>>>>>> BigQuery using BigQueryIO (+ FILE_LOADS)?
>>>>>>
>>>>>> I'd be interested in enhancing it to support this, but I'm curious if
>>>>>> there's any prior work here.
>>>>>>
>>>>>
