A quick update: stripping out the carriage returns ('\r') before passing the
text to Ruby's CSV.parse does the trick -- nothing else is needed.
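
For anyone who wants to try it, here is a minimal sketch of that fix. The
helper name parse_without_cr and the sample string are made up for
illustration; they are not part of Tika or of the CSV library. Note that
String#delete removes every '\r', including any inside quoted fields, so this
assumes you do not need to preserve them.

```ruby
require 'csv'

# Hypothetical helper: drop all carriage returns, then let CSV.parse
# handle the remaining "\n"-terminated text.
def parse_without_cr(text)
  CSV.parse(text.delete("\r"))
end

# Made-up sample with CRLF row endings and a CRLF inside a quoted field.
sample = "name,note\r\n\"Tika\",\"line one\r\nline two\"\r\n"
rows = parse_without_cr(sample)
# rows == [["name", "note"], ["Tika", "line one\nline two"]]
```

If carriage returns inside fields matter for your data, this approach is
probably too blunt and the row separator would need to be handled another way.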

On Oct 27, 2012, at 11:54 AM, "Mattmann, Chris A (388J)" 
<[email protected]> wrote:

> Hey David,
> 
> Thanks man for following this up on list and for the blog post -- great work! 
> I love TIKA-593 (our REST
> server) too! :)
> 
> Cheers,
> Chris
> 
> On Oct 26, 2012, at 2:37 PM, David James wrote:
> 
>> I have found no evidence that Tika is the problem. I have found reason
>> to suspect that Ruby 1.9.3's CSV is acting funny. This is my
>> work-around for Ruby 1.9.3, maybe it will be useful to someone besides
>> me.
>> 
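>> require 'csv'
>>
>> class TikaCSV
>>   # Split at each newline that is followed by an opening quote, then
>>   # parse each chunk separately and concatenate the resulting rows.
>>   def self.parse(s)
>>     s.split(/\n(?="[^"])/).reduce([]) { |a, x| a + CSV.parse(x) }
>>   end
>> end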
>> 
>> I also wrote it up here:
>> http://djwonk.tumblr.com/post/34370338490/visions-of-comma-separated-values
> 
