Thanks Geert and Erik

Now I am able to download the files, as they are ASCII encoded, but the 
downloaded text contains "&" characters. Ingesting that text, 
wrapped inside an element constructor, throws an invalid entity reference 
exception. Is there any way to replace every "&" character with "&amp;" before 
inserting into ML?

Sample Code:

let $downloaded-text := xdmp:http-get(
                                    $file-path,
                                    <options xmlns="xdmp:document-get">
                                       <encoding>ASCII</encoding>
                                       <repair>full</repair>
                                    </options> 
                                 )
return
xdmp:document-insert("sample",<root>{$downloaded-text}</root>)


XDMP-ENTITYREF: (err:XPST0003) Invalid entity reference " "
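
A note on the error above: XPST0003 is a static (parse-time) error, so it may come from a bare "&" written in the query source itself rather than from the downloaded text. The sketch below uses a hypothetical sample string in place of the downloaded text. It shows that "&" has to be written as "&amp;" inside an XQuery string literal, and how fn:replace could escape ampersands if that turns out to be needed. A string placed inside an element constructor becomes a text node, so it normally needs no manual escaping.

```xquery
xquery version "1.0-ml";

(: Hypothetical sample text standing in for the downloaded text.
   In XQuery source, "&amp;" denotes a single "&" character. :)
let $text := "Fish &amp; chips"

(: Replace each "&" with the five characters "&amp;".
   A bare "&" in either literal would be a syntax error. :)
let $escaped := fn:replace($text, "&amp;", "&amp;amp;")

(: Returns the escaped string, followed by an element built from
   the unescaped string; the constructor creates a text node. :)
return ($escaped, <root>{$text}</root>)
```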


Thanks
Abhishek Srivastav
Systems Engineer
Tata Consultancy Services
Cell:- +91-9883389968
Mailto: [email protected]
Website: http://www.tcs.com
____________________________________________
Experience certainty.   IT Services
                        Business Solutions
                        Outsourcing
____________________________________________



From:
Geert Josten <[email protected]>
To:
MarkLogic Developer Discussion <[email protected]>
Date:
06/26/2012 11:57 AM
Subject:
Re: [MarkLogic Dev General] UTF -8 Encoding Exception
Sent by:
[email protected]



Hi Abhishek,
 
Did you try xdmp:unquote with the repair-full option? There are also some 
format options that might interest you.
 
http://community.marklogic.com/pubs/5.0/apidocs/Ext-5.html#xdmp:unquote
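
As a rough sketch of that suggestion: xdmp:unquote parses a string into a document node, and its third argument takes option strings such as "repair-full" and "format-xml". The sample string below is hypothetical and stands in for the downloaded text. Which kinds of malformed input the repair step can actually fix is not something this sketch demonstrates.

```xquery
xquery version "1.0-ml";

(: Hypothetical sample input; in practice this would be the
   downloaded text as a string. :)
let $text := "<p>Fish &amp;amp; chips</p>"

(: Second argument is the default namespace (none here).
   "repair-full" requests full repair; "format-xml" parses as XML. :)
let $doc := xdmp:unquote($text, "", ("repair-full", "format-xml"))

(: Wrap the parsed content in a root element. :)
return <root>{$doc/node()}</root>
```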
 
Kind regards,
Geert
 
 
From: [email protected] [mailto:
[email protected]] On behalf of Abhishek53 S
Sent: Monday, 25 June 2012 16:01
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] UTF -8 Encoding Exception
 

Hi Geert, 

Thanks for the prompt reply! Is there any way to convert a non-UTF-8 encoded 
file to UTF-8 through a different API? The downloaded text 
file has invalid XML characters like &#12;, which need to be pre-processed 
before the text is inserted into an XML file. 

Thanks
Abhishek Srivastav


From: 
Geert Josten <[email protected]> 
To: 
MarkLogic Developer Discussion <[email protected]> 
Date: 
06/25/2012 06:41 PM 
Subject: 
Re: [MarkLogic Dev General] UTF -8 Encoding Exception 
Sent by: 
[email protected]
 




Hi Abhishek, 
  
The encoding option does not specify a target encoding for conversion; 
it specifies the encoding of the file you are trying to download. So you 
should figure out which encoding file-location.txt itself has, and 
specify that. 
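
A minimal sketch of that advice, assuming the file is ISO-8859-1 and lives at a hypothetical URL. Both are assumptions; the thread does not say which encoding or location the file really has.

```xquery
xquery version "1.0-ml";

(: Hypothetical URL; replace with the real location of the file. :)
let $file-path := "http://example.com/file-location.txt"

(: Assumption: the source file is ISO-8859-1. <encoding> names the
   encoding of the file being fetched, not a conversion target. :)
let $response := xdmp:http-get(
  $file-path,
  <options xmlns="xdmp:document-get">
    <encoding>iso-8859-1</encoding>
  </options>
)

(: xdmp:http-get returns the response metadata first, then the body. :)
return $response[2]
```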
  
Kind regards, 
Geert 
  
From: [email protected] [mailto:
[email protected]] On behalf of Abhishek53 S
Sent: Monday, 25 June 2012 14:51
To: MarkLogic Developer Discussion
Subject: [MarkLogic Dev General] UTF -8 Encoding Exception 
  

Hi Folks, 

I am having an issue downloading a non-UTF-8 encoded text file from a file 
server. I am using the http-get method to download text files and then 
inserting the text into XML documents. 

How can I convert non-UTF-8 text to UTF-8? 

Sample Code 
xdmp:http-get("file-location.txt",
    <options xmlns="xdmp:document-get">
        <encoding>utf-8</encoding>
    </options>
)

Exception: XDMP-DOCUTF8SEQ: -- document is not UTF-8 encoded 
Please let me know your suggestions. 

Thanks
Abhishek Srivastav
_______________________________________________
General mailing list
[email protected]
http://community.marklogic.com/mailman/listinfo/general