[ 
https://issues.apache.org/jira/browse/SOLR-1925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12888679#action_12888679
 ] 

Chris A. Mattmann commented on SOLR-1925:
-----------------------------------------

{quote}
Excel (at least the version I just tried) handled embedded newlines just fine. 
{quote}

Not for me. I'm using MS Office 2008 on Mac OS X 10.5.6. I also tried Office 
XP SP2 on a Win XP SP2 instance I have running in VMware, and saw the same 
behavior. What version are you looking at?

{quote}
AFAIK, the CSV spec doesn't recommend always using encapsulators.
{quote}

See here: http://en.wikipedia.org/wiki/Comma-separated_values, 1st Paragraph:

bq. Fields that contain a special character (comma, newline, or double quote), 
must be enclosed in double quotes.

Since we don't know what each field's value will contain, it's safest to 
account for that by always encapsulating values in double quotes. This doesn't 
break anything, and arguably isn't any uglier than the unquoted output (that's 
a judgment call). 
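
As a rough illustration of the always-encapsulate approach, here is a minimal 
sketch. The class and method names (CsvQuoting, quote, record) are hypothetical 
and are not taken from any attached patch. It assumes the common CSV convention 
of doubling an embedded double quote, and that a null value is written as an 
empty quoted field.

```java
// Hypothetical helper for illustration only; not code from the SOLR-1925 patches.
final class CsvQuoting {
    private CsvQuoting() {}

    /** Wraps a value in double quotes, doubling any embedded double quote. */
    static String quote(String value) {
        if (value == null) {
            // Assumption: a missing value becomes an empty quoted field.
            return "\"\"";
        }
        return "\"" + value.replace("\"", "\"\"") + "\"";
    }

    /** Joins raw values into one CSV record, quoting every field. */
    static String record(String... values) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < values.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(quote(values[i]));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Commas, quotes, and newlines all stay inside their quoted field.
        System.out.println(record("plain", "a,b", "say \"hi\"", "line1\nline2"));
    }
}
```

Because every field is quoted, the writer never has to inspect a value to 
decide whether quoting is needed; the only per-value work is doubling embedded 
quotes.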

{quote}
Proper escaping is an absolute necessity. You can't represent arbitrary text 
field values without it.
{quote}

How would you recommend doing so?

{quote}
If we do things correctly, we should be able to round-trip with 
http://wiki.apache.org/solr/UpdateCSV
{quote}

What's your rationale that this isn't compatible with that? Have you tried it? 
Also, I think round-tripping is a good thing to make happen in the end, but is 
it a blocker to getting this into the sources? My rationale is that, for 
instance, XML given to Solr doesn't always round-trip through the 
XMLResponseWriter (especially if the schema weeds out or discards fields).

{quote}
Having a server process act differently on different hosts is bad. We strive to 
never use the default locale - it's a recipe for non-portability. All file 
encodings (stopword lists, etc) default to UTF-8 instead of the system locale. 
Date and number formatting is standardized and does not use the system locale. 
We missed some of these in the past (and sure enough, Solr wouldn't work 
properly when installed on a machine of a certain locale), but Robert cleaned 
all that up.
{quote}

Admittedly, I'm not an expert here, so I'll take your word for it. What's the 
host-independent way to do System.getProperty("line.separator")?
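
One possible answer, as a sketch only: skip the system property and write a 
fixed terminator, so the output is identical on every host. The class name, 
the constant, and the choice of "\n" below are assumptions for illustration; 
the thread does not settle on a terminator.

```java
// Hypothetical sketch; names and the "\n" choice are illustrative assumptions.
final class FixedLineTerminator {
    private FixedLineTerminator() {}

    // A constant instead of System.getProperty("line.separator"),
    // so the result does not vary with the host platform.
    static final String LINE_TERMINATOR = "\n";

    /** Joins lines, ending each one with the fixed terminator. */
    static String joinLines(String... lines) {
        StringBuilder sb = new StringBuilder();
        for (String line : lines) {
            sb.append(line).append(LINE_TERMINATOR);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(joinLines("id,name", "1,solr"));
    }
}
```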

> CSV Response Writer
> -------------------
>
>                 Key: SOLR-1925
>                 URL: https://issues.apache.org/jira/browse/SOLR-1925
>             Project: Solr
>          Issue Type: New Feature
>          Components: Response Writers
>         Environment: indep. of env.
>            Reporter: Chris A. Mattmann
>            Assignee: Erik Hatcher
>             Fix For: Next
>
>         Attachments: SOLR-1925.Chheng.071410.patch.txt, 
> SOLR-1925.Mattmann.053010.patch.2.txt, SOLR-1925.Mattmann.053010.patch.3.txt, 
> SOLR-1925.Mattmann.053010.patch.txt, SOLR-1925.Mattmann.061110.patch.txt
>
>
> As part of some work I'm doing, I put together a CSV Response Writer. It 
> currently takes all the docs resulting from a query and then outputs their 
> metadata in simple CSV format. The delimiter is configurable (by default, 
> multiple values for a particular field are separated with a | symbol).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


