Agreed.

Adaryl "Bob" Wakefield, MBA
Principal
Mass Street Analytics, LLC
913.938.6685
www.massstreet.net
www.linkedin.com/in/bobwakefieldmba
Twitter: @BobLovesData

From: Sam Groth
Sent: Wednesday, May 11, 2016 11:11 AM
To: [email protected]
Subject: Re: is this an appropirate Avro use case?

So there are 2 possible cases that I see:

1) You are able to get the data producer to switch to Avro using type int/double for the number fields. Then they would be forced to follow the types in the schema.
2) You write a data cleansing layer to fix inconsistencies and handle schema changes. In this case, I don't see any advantage to using Avro.

Sam

On Wednesday, May 11, 2016 10:49 AM, Bob Wakefield <[email protected]> wrote:

If I’ve been following properly, it sounds like the schema change would be handled, but the data cleansing would still have to be coded. I was thinking of converting from CSV to Avro, but then I’d have to convert back to CSV to shove it into the database. I’m not opposed to doing that; I just don’t think it solves my problem with the negative numbers data type issue unless Avro understands (200) = -200.

From: kppublicmail .
Sent: Wednesday, May 11, 2016 10:35 AM
To: [email protected]
Subject: Re: is this an appropirate Avro use case?

Another option is to convert the CSV file to Avro before it is consumed.

Thanks.

On May 9, 2016 8:58 PM, "Sean Busbey" <[email protected]> wrote:

On Mon, May 9, 2016 at 12:21 PM, Koert Kuipers <[email protected]> wrote:
> you cannot use avro to ensure the data comes in the format you expect (the
> negative numbers issue). you will have to parse these variations before
> converting to avro.

Unless, of course, you can get the folks sending you data to agree to send it in Avro. If you specifically get them to send the numbers coded as one of the number types in Avro (rather than, e.g., a string), you'd be able to parse it the same way all of the time.

--
busbey
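
For reference, the cleansing step the thread keeps coming back to (reading "(200)" as -200 before the value is loaded anywhere, Avro or not) could look roughly like the sketch below. It is only an illustration under stated assumptions: the function name parse_amount is made up, and it assumes the only non-standard form in the CSV is a number wrapped in parentheses. Real feeds may carry other variations (thousands separators, trailing minus signs, currency symbols) that would need their own handling.

```python
from decimal import Decimal, InvalidOperation


def parse_amount(raw: str) -> Decimal:
    """Convert one CSV number field to a Decimal.

    Hypothetical helper: assumes accounting-style negatives such as
    "(200)" are the only non-standard form in the input.
    """
    text = raw.strip()
    negative = False

    # Accounting notation: a value wrapped in parentheses is negative.
    if text.startswith("(") and text.endswith(")"):
        negative = True
        text = text[1:-1].strip()

    try:
        value = Decimal(text)
    except InvalidOperation:
        raise ValueError(f"not a recognizable number: {raw!r}") from None

    # Decimal accepts "NaN" and "Infinity"; treat those as bad input too.
    if not value.is_finite():
        raise ValueError(f"not a finite number: {raw!r}")

    return -value if negative else value


if __name__ == "__main__":
    for field in ["200", "-200", "(200)", " (12.5) "]:
        print(repr(field), "->", parse_amount(field))
```

Once the field has been normalized this way it can be written out as a plain signed number, at which point it fits an Avro int/double field or a database numeric column equally well. That matches Sam's point that the cleansing layer does the real work here.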
