Thanks for the quick response, Cham.

In my use case (supporting the VCF <https://samtools.github.io/hts-specs/VCFv4.2.pdf> format), each value in the repeated sequence has an associated context. In other words, the index of each value determines its context, and some values may be null, so [0, None, 1] is different from [None, 0, 1]. Having a generic 'default' value is also not ideal, as the context may change between fields (unless we use something like sys.maxint). Your suggestion of using repeated records would work, but it has the drawback of complicating the schema.
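For concreteness, here is a rough sketch of what the repeated-records layout might look like. The field names (allele_counts, value) and the helper functions are made up for illustration; only the overall shape (a REPEATED RECORD with a single NULLABLE field) reflects the suggestion.

```python
# Hypothetical schema fragment: the names are illustrative only.
# A REPEATED RECORD with one NULLABLE field lets each position hold a null.
ALLELE_COUNTS_SCHEMA = {
    "name": "allele_counts",
    "type": "RECORD",
    "mode": "REPEATED",
    "fields": [
        {"name": "value", "type": "INTEGER", "mode": "NULLABLE"},
    ],
}


def wrap_values(values):
    """Wrap each item in a one-field record so that None keeps its index."""
    return [{"value": v} for v in values]


def unwrap_values(records):
    """Recover the original positional list, including None entries."""
    return [record["value"] for record in records]


# Example row: the position of the None is preserved.
row = {"allele_counts": wrap_values([0, None, 1])}
```

This keeps [0, None, 1] and [None, 0, 1] distinct, at the cost of an extra level of nesting for every such field.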
Anyhow, it doesn't seem like there is an easy solution, but please let me know if you have any other thoughts on this.

Thanks again,
Asha

On Mon, Sep 18, 2017 at 1:41 PM, Chamikara Jayalath <chamik...@apache.org> wrote:

> NaN and Inf values are not JSON compliant and hence not supported. We use
> JSON BigQuery load when writing to BigQuery using DataflowRunner.
> https://github.com/apache/beam/blob/master/sdks/python/apache_beam/io/gcp/bigquery.py#L155
>
> Other values including 'None' are supported. Why do you need to record
> 'None' values for a repeated integer field? Can you update the table
> schema to support your use case? For example,
>
> * maintaining a count of None values in a separate field
> * defining a repeated field for a record type with one nullable field
>
> - Cham
>
> On Mon, Sep 18, 2017 at 10:08 AM Asha Rostamianfar
> <arost...@google.com.invalid> wrote:
>
> > Is there a way to write 'NaN' to BigQuery using the
> > Python beam.io.BigQuerySink?
> >
> > It complains that NaN is not supported in JSON if I try using
> > float('NaN').
> >
> > Context: given that null values are not supported in repeated fields for
> > BigQuery (e.g. having [0, None, 1]), I'd like to find a way to represent
> > 'None' values for numeric types. I thought using NaN may be a good
> > workaround if possible. Any 'special' value would work for this purpose
> > actually.
> >
> > Thanks,
> > Asha
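
For reference, the JSON limitation mentioned in the thread can be reproduced with the standard library alone. This is only a sketch: it assumes the sink applies strict JSON rules, and the helper names below are made up rather than anything from Beam.

```python
import json


def to_strict_json(value):
    """Serialize using strict JSON rules; NaN and Inf raise ValueError."""
    return json.dumps(value, allow_nan=False)


def is_strict_json_safe(value):
    """Return True if the value can be serialized as strict JSON."""
    try:
        to_strict_json(value)
        return True
    except ValueError:
        return False


# None maps to JSON null, so it serializes without trouble.
print(to_strict_json([0, None, 1]))
# NaN has no strict JSON representation, so this reports False.
print(is_strict_json_safe([0, float("nan"), 1]))
```

So None itself is fine at the JSON level; the restriction on nulls inside repeated fields comes from BigQuery, not from the serialization step.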