Hi, Abhishek:
xdmp:binary-decode() should do the conversion once you know the source encoding.
xdmp:encoding-language-detect() will give you some guesses at that encoding.
One popular post on encodings advises that it's best to ask
the publisher of the text instead of trying to guess:
http://www.joelonsoftware.com/printerFriendly/articles/Unicode.html
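A minimal sketch of how the two functions mentioned above might be combined, assuming the file is first fetched as a binary node so that no decoding is attempted during the download. The URI is the placeholder from the original question, and taking the first detection result as the best guess is an assumption, not a guarantee:

```xquery
xquery version "1.0-ml";

(: Fetch the raw bytes. The second item returned by xdmp:http-get is
   assumed to be a document node wrapping the binary content. :)
let $binary :=
  xdmp:http-get("file-location.txt",
    <options xmlns="xdmp:document-get">
      <format>binary</format>
    </options>)[2]/node()

(: Ask the server for its encoding guesses; the first one is assumed
   to be the most likely. :)
let $guesses := xdmp:encoding-language-detect($binary)
let $encoding := fn:string(($guesses/*:encoding)[1])

(: Decode the bytes from the guessed encoding into a Unicode string. :)
return xdmp:binary-decode($binary, $encoding)
```

Detection is only a guess, so if the publisher can state the encoding, passing that name directly to xdmp:binary-decode() is the safer route.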
Hoping that's useful,
Erik Hennum
________________________________________
From: [email protected]
[[email protected]] On Behalf Of Abhishek53 S
[[email protected]]
Sent: Monday, June 25, 2012 7:00 AM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] UTF -8 Encoding Exception
Hi Geert,
Thanks for the prompt reply! Is there any way to convert a non-UTF-8 encoded file
to UTF-8 through some different API? The downloaded text file has invalid
XML characters like  which need to be pre-processed before the text is written
to an XML file.
Thanks
Abhishek Srivastav
Systems Engineer
Tata Consultancy Services
Cell:- +91-9883389968
Mailto: [email protected]
Website: http://www.tcs.com
____________________________________________
Experience certainty. IT Services
Business Solutions
Outsourcing
____________________________________________
From: Geert Josten <[email protected]>
To: MarkLogic Developer Discussion <[email protected]>
Date: 06/25/2012 06:41 PM
Subject: Re: [MarkLogic Dev General] UTF -8 Encoding Exception
Sent by: [email protected]
________________________________
Hi Abhishek,
The encoding option does not specify a target encoding for conversion; it
specifies the encoding of the file you are trying to download. So you should
figure out which encoding file-location.txt itself uses, and just specify that.
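As a rough illustration of that point, the request below declares the encoding of the source file rather than the desired output. The value iso-8859-1 is only an assumed example; the real encoding of file-location.txt is not known from this thread:

```xquery
xquery version "1.0-ml";

(: The <encoding> option names the encoding the remote file is in.
   "iso-8859-1" is an assumption for illustration; substitute the
   encoding the file actually uses. :)
xdmp:http-get("file-location.txt",
  <options xmlns="xdmp:document-get">
    <encoding>iso-8859-1</encoding>
  </options>)[2]
```

With the source encoding declared, the server can transcode the content to UTF-8 on the way in, so no separate conversion step should be needed.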
Kind regards,
Geert
From: [email protected]
[mailto:[email protected]]
On Behalf Of Abhishek53 S
Sent: Monday, June 25, 2012 14:51
To: MarkLogic Developer Discussion
Subject: [MarkLogic Dev General] UTF -8 Encoding Exception
Hi Folks,
I am having an issue downloading a non-UTF-8 encoded text file from a file server.
I am using the xdmp:http-get function to download text files and then updating
the text inside XML documents.
How can I convert non-UTF-8 text to UTF-8?
Sample code:
xdmp:http-get("file-location.txt",
  <options xmlns="xdmp:document-get">
    <encoding>utf-8</encoding>
  </options>
)
Exception: XDMP-DOCUTF8SEQ: -- document is not UTF-8 encoded
Please let me know your suggestions.
Thanks
Abhishek Srivastav
General mailing list
[email protected]
http://community.marklogic.com/mailman/listinfo/general