Agreed.

Adaryl "Bob" Wakefield, MBA
Principal
Mass Street Analytics, LLC
913.938.6685
www.massstreet.net
www.linkedin.com/in/bobwakefieldmba
Twitter: @BobLovesData

From: Sam Groth 
Sent: Wednesday, May 11, 2016 11:11 AM
To: [email protected] 
Subject: Re: is this an appropriate Avro use case?

So there are 2 possible cases that I see: 1) You are able to get the data 
producer to switch to Avro using type int/double for the number fields. Then 
they would be forced to follow the types in the schema. 2) You write a data 
cleansing layer to fix inconsistencies and handle schema changes. In this case, 
I don't see any advantage to using Avro.


Sam 



On Wednesday, May 11, 2016 10:49 AM, Bob Wakefield 
<[email protected]> wrote:




If I’ve been following properly, it sounds like the schema change would be 
handled, but data cleansing would still have to be coded. I was thinking of 
converting from CSV to Avro, but then I’d have to convert back to CSV to shove 
it into the database. I’m not opposed to doing that; I just don’t think it 
solves my problem with the negative numbers data type issue unless Avro 
understands that (200) = –200.
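
For illustration, the cleansing step discussed here could look something like the sketch below. It is only a sketch under assumptions: the function name parse_amount is hypothetical, and it assumes the only variations are surrounding whitespace, thousands separators, and parentheses for negatives. The real feed may need more rules.

```python
from decimal import Decimal, InvalidOperation


def parse_amount(raw: str) -> Decimal:
    """Parse a CSV number field, treating accounting-style (200) as -200.

    Hypothetical helper: assumes commas are thousands separators and that
    parentheses are the only non-standard way negatives are written.
    """
    text = raw.strip().replace(",", "")
    # Accounting notation wraps negative values in parentheses.
    negative = text.startswith("(") and text.endswith(")")
    if negative:
        text = text[1:-1].strip()
    try:
        value = Decimal(text)
    except InvalidOperation:
        # Surface unknown formats instead of silently loading bad data.
        raise ValueError(f"unrecognized numeric field: {raw!r}") from None
    return -value if negative else value
```

A function like this would run on each number field before the row is loaded, whether or not Avro sits in between.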

From: kppublicmail . 
Sent: Wednesday, May 11, 2016 10:35 AM
To: [email protected] 
Subject: Re: is this an appropriate Avro use case?

Another option is to convert the CSV file to Avro before it is consumed.
Thanks.
On May 9, 2016 8:58 PM, "Sean Busbey" <[email protected]> wrote:

  On Mon, May 9, 2016 at 12:21 PM, Koert Kuipers <[email protected]> wrote:
  > you cannot use avro to ensure the data comes in the format you expect (the
  > negative numbers issue). you will have to parse these variations before
  > converting to avro.


  Unless, of course, you can get the folks sending you data to agree to
  send it in Avro. If you specifically get them to send the numbers
  coded as one of the number types in Avro (rather than, e.g., a string),
  you'd be able to parse it the same way all of the time.
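
To make the point concrete, here is a rough sketch of what a schema with numeric field types might look like. The record and field names (Transaction, account_id, amount) are made up for illustration. The conforms function is a hand-rolled, deliberately strict stand-in for the type check an Avro writer would perform; it is not the Avro library's API.

```python
import json

# Hypothetical Avro record schema: the amount is declared as a double,
# so a producer cannot legally write it as the string "(200)".
SCHEMA_JSON = """
{
  "type": "record",
  "name": "Transaction",
  "fields": [
    {"name": "account_id", "type": "int"},
    {"name": "amount", "type": "double"}
  ]
}
"""

# Simplified mapping from the two Avro types used above to Python types.
PYTHON_TYPES = {"int": int, "double": float}


def conforms(record: dict, schema: dict) -> bool:
    """Return True if every field holds a value of the declared type.

    Illustration only: a real Avro library does this (and more) on write.
    """
    for field in schema["fields"]:
        value = record.get(field["name"])
        expected = PYTHON_TYPES[field["type"]]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            return False
    return True


schema = json.loads(SCHEMA_JSON)
```

With this schema, a record such as {"account_id": 1, "amount": -200.0} passes the check, while {"account_id": 1, "amount": "(200)"} does not, which is the enforcement being described.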




  --
  busbey


