Hi James,
I have been investigating this report. Below I summarize the current
behavior, give a detailed analysis of what happens on a North American
Windows platform, and sketch a workaround.
1. Current behavior
You are correct: on some platforms, for example a North American Windows
installation, TBC spreadsheet export creates a text file with ?????? in
place of the Arabic word.
This behavior is, surprisingly, correct, and is not going to change.
I will check behavior on Linux, which I expect to be different, and
report back later today.
2. Windows in detail
Here is my analysis of Windows.
TBC exports a tab-separated text file in the platform default encoding.
This should allow that text file to interoperate with other applications
on the same platform (for example Excel).
Unfortunately on my system, and apparently on yours, the Arabic
characters we are trying to export are not part of the platform default
encoding, so they cannot be written to a text file that other
applications on the system can read.
A low-level Java utility makes the sensible decision to replace those
characters with a ?, resulting in the observed behavior. That behavior is
correct in the sense that, having been asked to export the query results
as a plain text spreadsheet file, we have done as well as we reasonably
can.
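To illustrate the replacement described above, here is a minimal Java
sketch. It is not TBC's export code. It assumes the platform default
encoding is windows-1252 (typical for a North American Windows
installation) and relies on the standard String.getBytes(Charset) call,
which substitutes the charset's replacement byte for characters it cannot
encode. The class name EncodingDemo and the method roundTrip are made up
for illustration.

```java
import java.nio.charset.Charset;

final class EncodingDemo {
    // Encode text in the named charset and decode it again, roughly what
    // happens when a text export is written and then opened elsewhere.
    static String roundTrip(String text, String charsetName) {
        Charset cs = Charset.forName(charsetName);
        byte[] bytes = text.getBytes(cs); // unmappable characters are replaced
        return new String(bytes, cs);
    }

    public static void main(String[] args) {
        // The Arabic label from the report, written with Unicode escapes so
        // that the source file's own encoding does not matter.
        String arabic = "\u062C\u064A\u0632\u0627\u0646";

        // Assumption: windows-1252 stands in for the platform default encoding.
        boolean encodable = Charset.forName("windows-1252").newEncoder().canEncode(arabic);
        System.out.println("encodable in windows-1252: " + encodable);
        System.out.println("after windows-1252 round trip: " + roundTrip(arabic, "windows-1252"));
        System.out.println("UTF-8 round trip intact: " + roundTrip(arabic, "UTF-8").equals(arabic));
    }
}
```

Round-tripping the same string through UTF-8 leaves it intact, which is
why export formats that carry Unicode do not show the problem.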
3. Workaround
(This is only a sketch, and would be quite a lot of work.)
TBC also provides for exporting the query results as an XML document (I
believe as a SPARQL result set document - not an area I am very familiar
with).
If you are using Excel as your spreadsheet tool, then the format "XML
Spreadsheet 2003", which I found using "Save as ..." in Excel, is a
possible XML format.
If you are using OpenOffice, then that too has an XML spreadsheet
format, which will, of course, be different.
For each such format, you could write an XSLT transform that takes the
SPARQL result set and converts it into the required spreadsheet document.
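As a rough illustration of the transform idea, the sketch below applies
an XSLT 1.0 stylesheet to a SPARQL results document through the standard
javax.xml.transform API and emits Excel's XML Spreadsheet 2003 markup. It
assumes the exported document follows the W3C SPARQL Query Results XML
Format. The stylesheet, the class name SparqlToSpreadsheet, the sheet
name Results, the output file name and the sample data are all
hypothetical, and none of it has been checked against TBC's actual output
or a particular Excel version.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

final class SparqlToSpreadsheet {
    // Hypothetical stylesheet: a header row of variable names, then one row per result.
    static final String XSLT = String.join("\n",
        "<xsl:stylesheet version='1.0'",
        "    xmlns:xsl='http://www.w3.org/1999/XSL/Transform'",
        "    xmlns:sr='http://www.w3.org/2005/sparql-results#'",
        "    xmlns='urn:schemas-microsoft-com:office:spreadsheet'",
        "    xmlns:ss='urn:schemas-microsoft-com:office:spreadsheet'",
        "    exclude-result-prefixes='sr'>",
        "  <xsl:output method='xml' encoding='UTF-8' indent='yes'/>",
        "  <xsl:template match='/sr:sparql'>",
        "    <xsl:processing-instruction name='mso-application'>progid=\"Excel.Sheet\"</xsl:processing-instruction>",
        "    <Workbook><Worksheet ss:Name='Results'><Table>",
        "      <Row>",
        "        <xsl:for-each select='sr:head/sr:variable'>",
        "          <Cell><Data ss:Type='String'><xsl:value-of select='@name'/></Data></Cell>",
        "        </xsl:for-each>",
        "      </Row>",
        "      <xsl:for-each select='sr:results/sr:result'>",
        "        <xsl:variable name='row' select='.'/>",
        "        <Row>",
        "          <xsl:for-each select='/sr:sparql/sr:head/sr:variable'>",
        "            <xsl:variable name='v' select='@name'/>",
        "            <Cell><Data ss:Type='String'><xsl:value-of select='$row/sr:binding[@name=$v]/*'/></Data></Cell>",
        "          </xsl:for-each>",
        "        </Row>",
        "      </xsl:for-each>",
        "    </Table></Worksheet></Workbook>",
        "  </xsl:template>",
        "</xsl:stylesheet>");

    // Made-up sample in the SPARQL Query Results XML Format, one row with an Arabic label.
    static final String SAMPLE = String.join("\n",
        "<sparql xmlns='http://www.w3.org/2005/sparql-results#'>",
        "  <head><variable name='subject'/><variable name='label'/></head>",
        "  <results><result>",
        "    <binding name='subject'><uri>http://example.org/pizza#American</uri></binding>",
        "    <binding name='label'><literal xml:lang='ar'>\u062C\u064A\u0632\u0627\u0646</literal></binding>",
        "  </result></results>",
        "</sparql>");

    // Run the stylesheet over a results document and return the spreadsheet XML as text.
    static String transform(String sparqlResultsXml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSLT)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(sparqlResultsXml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("XSLT transform failed", e);
        }
    }

    public static void main(String[] args) throws IOException {
        // Write the bytes explicitly as UTF-8 so the platform default encoding is never involved.
        String xml = transform(SAMPLE);
        Files.write(Paths.get("pizza-results.xml"), xml.getBytes(StandardCharsets.UTF_8));
    }
}
```

The important design point is that the output is written as UTF-8 rather
than in the platform default encoding, so the Arabic text survives
regardless of the Windows locale.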
If you do go down this route, and would like us to integrate your
transform(s) into future versions of TBC, please get back to us, and we
will see if this is possible.
With Excel, I had a look at many of the formats that it supports on my
system; most were either binary or in the platform default encoding, and
so did not support Arabic characters on my system.
Jeremy
James A Miller wrote:
> Scott,
>
> I'm still seeing the same problem.
>
> I opened pizza.owl in TBC, added a line to the rdfs:label for the
> American pizza, containing an Arabic word جيزان {...@ar} . I saved it,
> and ran the following query:
>
> SELECT *
> WHERE {
> ?subject rdfs:label ?label
> }
>
> which gives me a set of results, including my (unchanged) Arabic
> label. Then I did Export Results to File, and entered pizza.xls as
> the name, and Tab-separated Spreadsheet as the type.
>
> The resultant file is attached. I opened it several ways, and asked
> someone else to check it out. As far as we can tell, it really has
> ?????? in the place of the Arabic text.
>
> What am I doing wrong?
>
> (Note for the Arabic-knowledgeable people out there--I realize this is
> not the name of a pizza, but that is not important for this problem...)
>
>
> Jim
>
>
> *Scott Henninger <[email protected]>*
> Sent by: [email protected]
>
> 07/14/2009 04:46 PM
> Please respond to
> [email protected]
>
>
>
> To
> TopBraid Composer Users <[email protected]>
> cc
>
> Subject
> [tbc-users] Re: Multilingual data in extracts (Replaces prior message
> with bad Subject)
>
>
>
>
>
>
>
>
>
>
> <Sorry--forgot to change the earlier Subject-->
>
> (normally it is best to start a new thread - hijacked threads make it
> difficult to find past postings)
>
> There could be issues with your file or the program you are using to
> view the tab-delimited file? As a test, open the pizza.owl file in
> TopBraid > Examples and enter the following query.
>
> SELECT *
> WHERE {
> ?subject rdfs:subClassOf owl:Thing .
> ?subject rdfs:label ?label
> }
>
> The query will find a number of Portuguese labels. The .txt file is
> generated as expected. There aren't any special characters in this
> file, but if you add some, you will find that they display ok in
> editors that recognize the characters.
>
> You can try sending a sample and we'll look into it.
>
> -- Scott
>
> On Jul 14, 1:42 pm, James A Miller <[email protected]>
> wrote:
> > Sorry--forgot to change the earlier Subject--
> >
> > Maybe I am expecting too much from Composer, but when I export the
> > results of a SPARQL query involving multilingual data, and I choose
> > to save as type "Tab-separated spreadsheet", my foreign-language data
> > becomes a series of question marks. Should this work? Am I omitting
> > a step? (When I export as XML, and let Excel do spreadsheet
> > conversion, it works OK (but not in a preferred column/row format).)
> >
> > Jim
>
>