You could increment a counter and log only the first few thousand bad rows or so.
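
A rough sketch of that idea, in case it helps the discussion. This is not AvroStorage code: the class name BadRecordSkipper, the decoder function, and the maxLogged cap are all hypothetical stand-ins, and in a real loader the count would presumably go to a Hadoop counter rather than a plain field.

```java
import java.util.function.Function;
import java.util.logging.Logger;

/**
 * Hypothetical helper (not part of Pig or AvroStorage): wraps a record
 * decoder, counts failures, returns null for bad records, and logs only
 * the first maxLogged failures so the log cannot grow without bound.
 */
public class BadRecordSkipper<I, O> {
    private static final Logger LOG =
            Logger.getLogger(BadRecordSkipper.class.getName());

    private final Function<I, O> decoder; // stands in for the real record decoding
    private final long maxLogged;         // cap on the number of logged failures
    private long badRecords = 0;          // a Hadoop counter in a real job (assumption)

    public BadRecordSkipper(Function<I, O> decoder, long maxLogged) {
        this.decoder = decoder;
        this.maxLogged = maxLogged;
    }

    /** Returns the decoded record, or null if decoding threw an exception. */
    public O decode(I raw) {
        try {
            return decoder.apply(raw);
        } catch (RuntimeException e) {
            badRecords++;
            // Log the first maxLogged failures only; later ones are just counted.
            if (badRecords <= maxLogged) {
                LOG.warning("Skipping bad record #" + badRecords + ": " + e);
            }
            return null;
        }
    }

    /** Total number of records that failed to decode so far. */
    public long getBadRecords() {
        return badRecords;
    }
}
```

With a cap of a few thousand, the log stays bounded while the counter still reports how many records were dropped in total.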

On Mar 24, 2012, at 10:33 AM, Bill Graham <[email protected]> wrote:

> The pattern I use with bad data is to increment a counter and return null.
> Logging an error message is also good, but that could turn into a massive
> log file if there's a large dataset of bad data. Would be curious to hear
> others' thoughts re the logging bit.
>
> Either way, I think this is a good change to make to AvroStorage.
>
> On Fri, Mar 23, 2012 at 7:03 PM, Russell Jurney
> <[email protected]> wrote:
>
>> One record in a 125MB Avro file is killing my script. I could patch
>> AvroStorage() to catch the exception and return null after logging an error
>> - I think. Should I?
>>
>> --
>> Russell Jurney twitter.com/rjurney [email protected]
>> datasyndrome.com
>
> --
> *Note that I'm no longer using my Yahoo! email address. Please email me at
> [email protected] going forward.*
