David, I think it would be sufficient to say that filesystem-file is for text files only, although the remark on encoding could be a hint, but it's just that, a hint. Today, there is one line in the summary, that mentions very generically "reads a file":
Reads a file from the filesystem. The file at the specified path must be UTF-8 encoded.

Specify here that it's about text files. Add a mention of external-binary to use for reading binary files. Given the very different names (certainly justified historically), it's not easy to make that connection.

Oh, and while we're on documentation, why not document filesystem-directory-create?

cheers,
Jakob.

On Thu, Jun 13, 2013 at 12:54 AM, David Lee <[email protected]> wrote:

> Curious, from a user perspective, what additional information would you
> like from the MarkLogic documentation?
>
> From a developer perspective (buried in the trees so I don't see the
> forest) it seems obvious to me ... but clearly not to everyone.
>
> In the docs:
>
> xdmp:filesystem-file clearly states it only works on text files in UTF8
> encoding.
>
> xdmp:external-binary clearly states it's for binary files.
>
> PDF is a binary type file. In fact one should know that unless you *know*
> for sure that your file is UTF8 text, it should be treated as binary.
>
> Without listing every possible file type in existence, or diverting to the
> book it might take to explain the difference in filetypes, encodings, text
> and binary and unicode and UTF representations etc, both now and the
> imaginable future, ... what more would you like to see in the
> documentation?
>
> -----------------------------------------------------------------------------
> David Lee
> Lead Engineer
> MarkLogic Corporation
> [email protected]
> Phone: +1 812-482-5224
> Cell: +1 812-630-7622
> www.marklogic.com
>
> *From:* [email protected] [mailto:[email protected]]
> *On Behalf Of* Jakob Fix
> *Sent:* Wednesday, June 12, 2013 6:46 PM
> *To:* MarkLogic Developer Discussion
> *Subject:* Re: [MarkLogic Dev General] problem with xdmp:filesystem-file
> on Windows
>
> Thanks David, I didn't see the comment at the bottom of the blog post I
> was referencing (my bad). We can confirm that this works as expected.
>
> I have added a Disqus comment to the ML6 documentation for
> xdmp:filesystem-file in that respect. Would be great if this information
> could find its way into the official documentation.
>
> cheers,
> Jakob.
>
> On Wed, Jun 12, 2013 at 6:19 PM, David Lee <[email protected]> wrote:
>
> You need to read the file as binary, not text. The link is correct.
>
> Use xdmp:external-binary to read binary files:
>
> https://docs.marklogic.com/xdmp:external-binary
>
> *From:* [email protected] [mailto:[email protected]]
> *On Behalf Of* Jakob Fix
> *Sent:* Wednesday, June 12, 2013 12:17 PM
> *To:* General Mark Logic Developer Discussion
> *Subject:* [MarkLogic Dev General] problem with xdmp:filesystem-file on
> Windows
>
> Hi,
>
> we’re encountering a problem reading a PDF file via xdmp:filesystem-file
> on a Windows 2008 server (EA2).
>
> The problem also exists for ML 6 and is explained here:
>
> http://learnxquery.blogspot.fr/2012/08/not-possible-to-read-pdf-from-file.html
>
> We can read other files, like HTML or Excel, but not PDFs (we tried
> several sizes and PDF versions). The file seems to be truncated (we receive
> only about 2kb, or even nothing).
>
> Stacktrace in qconsole:
>
> [1.0-ml] XDMP-READFILE: for $r in $results -- ReadFile File is not in UTF-8:
> C:/Applications/kappav3/backend/kv3-jfix-contents-library/f-78/51758045038371473.pdf
>
> Stack Trace
> In /qconsole/endpoints/evaler.xqy on line 276
> In local:format-eval-result(xdmp:filesystem-file("C:/Applications/kappav3/backend/kv3-jfix-contents-library/f-78/51758045038371473.pdf"))
>
> $results := xdmp:filesystem-file("C:/Applications/kappav3/backend/kv3-jfix-contents-library/f-78/51758045038371473.pdf")
>
> Thanks,
> Jakob.
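
For anyone landing on this thread later, here is a minimal sketch of the fix David describes: read the PDF with xdmp:external-binary instead of xdmp:filesystem-file. The path is a hypothetical placeholder, and using xdmp:binary-size as a truncation check is only an assumption about what is convenient here; please verify both calls against the documentation for your MarkLogic version.

```xquery
xquery version "1.0-ml";

(: Hypothetical path, for illustration only; substitute your own file. :)
let $path := "C:/data/example.pdf"

(: xdmp:external-binary returns a binary() node rather than decoding the
   file as UTF-8 text, which is what triggers the XDMP-READFILE error
   reported in the original message. :)
let $pdf := xdmp:external-binary($path)

(: Report the size in bytes as a quick check that nothing was truncated. :)
return xdmp:binary-size($pdf)
```

If the reported size matches the size of the file on disk, the binary was read in full; xdmp:filesystem-file remains the right call only for files known to be UTF-8 text.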
_______________________________________________
General mailing list
[email protected]
http://developer.marklogic.com/mailman/listinfo/general
