Great!
A couple of hints.
1) If you're starting and ending with strings you never need text{} ... just
use strings.
2) Your use of string-join is strange; was that a copy error?
string-join takes a sequence as its first argument, and the separator (joiner)
as the second.
I think you have misplaced a pair of ().
3) Your concat will work, but it is more simply done by dropping the /string()
and turning it into a string-join,
like:
for $row in $xml/row
return string-join( ($row/Value1, $row/Value2,
$row/Value3, $row/Value4) , "&#9;" )
Both concat and string-join accept nodes and will convert their args to strings, so
you don't need the /string().
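To make hints 2 and 3 concrete, here is a minimal sketch. The one-row $row below is made-up sample data, not your actual document. It shows the argument order for string-join and that, at least for untyped elements like these, passing the nodes directly should give the same string as adding /string().

```xquery
(: Hypothetical one-row sample; element names follow the thread. :)
let $row := <row><Value1>a</Value1><Value2>b</Value2></row>
return (
  (: the sequence goes first (note the inner parentheses), the separator second :)
  string-join( ("x", "y", "z"), "," ),
  (: nodes are atomized on the way in, so these two calls should agree :)
  string-join( ($row/Value1, $row/Value2), "&#9;" ),
  string-join( ($row/Value1/string(), $row/Value2/string()), "&#9;" )
)
```

If the elements were schema-typed the conversion rules could behave differently, so treat this as an illustration for plain untyped XML only.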
And now that you see you are just writing out rows of columnar text, you can
make it cleaner.
declare function local:row( $cols )
{
  string-join( $cols , "&#9;" )
};
declare function local:table( $lines )
{
  string-join( $lines , "&#13;&#10;" )
};
let $rows := $xml/row,
    $max := max( $rows/count(*) )
return
  (: local:table takes ONE argument, so the header and the rows are wrapped in a sequence :)
  local:table( (
    local:row( for $i in 1 to $max return "Column" || $i ),
    for $r in $rows return
      local:row( $r/(Value1|Value2|Value3|Value4) ) ) )
Extra credit for replacing the tab with a calculation of the spacing required
in spaces for each value,
so that it's at least 1 more than the maximum column width.
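One possible way to do the padding, as a rough sketch only. local:pad and the two sample rows are hypothetical names and data introduced here for illustration; the column width is taken as the longest value in that column plus 1.

```xquery
(: Hypothetical helper: right-pad $value with spaces up to $width characters. :)
declare function local:pad( $value as xs:string, $width as xs:integer ) as xs:string
{
  concat( $value,
          string-join( for $i in 1 to ($width - string-length($value)) return " ", "" ) )
};

(: hypothetical sample rows :)
let $rows := (
    <row><Value1>alpha</Value1><Value2>b</Value2></row>,
    <row><Value1>c</Value1><Value2>delta</Value2></row> )
let $ncols := max( $rows/count(*) )
(: width of each column = longest value in that column + 1 :)
let $widths :=
  for $c in 1 to $ncols
  return max( $rows/*[$c]/string-length(.) ) + 1
for $r in $rows
return
  string-join(
    for $c in 1 to $ncols
    return local:pad( string($r/*[$c]), $widths[$c] ),
    "" )
```

A row with fewer children than $ncols simply gets blank padding for the missing columns, since string(()) is the empty string.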
And extra/extra credit for tossing the CR/LF and instead producing simply a
sequence of strings,
and letting your serializer decide what LF format to use.
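A sketch of that last idea, under the same assumptions as before ($xml here is made-up sample data): return one string per line and add no line endings yourself. What actually ends up between the items depends on the processor and its output settings, so check what yours does.

```xquery
declare function local:row( $cols )
{
  string-join( $cols, "&#9;" )
};

(: hypothetical sample data standing in for $xml :)
let $xml :=
  <rows>
    <row><Value1>a</Value1><Value2>b</Value2></row>
    <row><Value1>c</Value1><Value2>d</Value2></row>
  </rows>
return (
  (: one string per output line, no CR/LF appended :)
  local:row( ("Column1", "Column2") ),
  for $r in $xml/row return local:row( $r/* )
)
```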
From: [email protected]
[mailto:[email protected]] On Behalf Of Tim
Sent: Monday, September 01, 2014 11:07 PM
To: 'MarkLogic Developer Discussion'
Subject: Re: [MarkLogic Dev General] How to force EOL characters when
downloading a text file
Hi David,
I ended up doing something similar to the following (with different header
names and values, but the construction is the same):
text {string-join(
"# Column 1 Column 2 Column 3 Column4",
for $row in $xml/row
return concat($row/Value1/string(), "&#9;", $row/Value2/string(), "&#9;",
$row/Value3/string(), "&#9;", $row/Value4/string)
), "&#13;&#10;" }
Tim
From:
[email protected]<mailto:[email protected]>
[mailto:[email protected]] On Behalf Of David Lee
Sent: Monday, September 01, 2014 8:44 PM
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] How to force EOL characters when
downloading a text file
Tim, could you show a (shortened if you want, but complete) example of your
query,
how you invoke it, and how you are getting the results?
The behavior you describe is *probably* the serialization of XDM to text as
described here
http://www.w3.org/TR/xquery-30/#id-serialization (or)
http://www.w3.org/TR/xslt-xquery-serialization/
(rather obtusely, until you learn how to decipher W3C specification documents).
A critical issue is that the conversion of bare "text nodes" to "text" is part
of the serialization process,
not part of the node construction. Node construction with multiple children
does *not* add newlines ...
(it adds spaces - see below)
Serialization *may* add newlines, depending on exactly how you are
constructing your document, where you are sending the output, and what software
and settings are used to eventually get it to where you see it.
I suspect you are outputting *only* text nodes ... which means the result is a
"Sequence of Nodes"
and falls under the category of (5.2.7 Serialization Feature)
which is full of "may"s and "musts" and "implementation-defined"
But in general most XDM processors (XDM being the result of an XQuery or XSLT)
that produce text are consistent,
and if not told otherwise (via various output method declarations, command line
overrides, API settings etc.)
do this:
  For every item in the result
    Convert that item to a "string" (using the serialization or atomization
    rules for that item)
    Output that string followed by a newline
(It takes about 20 pages to distill to this ... but in your case the critical
part is whether you are producing
a sequence of items or a single item that wraps a sequence.)
A sequence will be newline separated.
Why? Because text nodes are treated differently during element construction
than they are by themselves.
During element construction adjacent text nodes are combined (without any
separation).
(http://www.w3.org/TR/xquery-30/ , 3.9.1.3 Content,
"Adjacent text nodes in the content sequence are merged into a single text node
by concatenating their contents, with no intervening blanks. After
concatenation, any text node whose content is a zero-length string is deleted
from the content sequence."
)
If you then serialize the element it won't have any extra spaces.
BUT ... if your XQuery produces a sequence of values (strings, dates, nodes,
whatever)
then each item *during serialization* is individually serialized, and depending
on the processor likely
newline separated.
Try this
<e>{
text {"a"}, text{"string"}, text{"is"} , text{"here"}
}</e>
You should get something like this
<e>astringishere</e>
This may vary depending on various settings, but it will NOT put a newline
between the text nodes.
The point here is that this is a ONE-item result (an element).
Now try this
text {"a"}, text{"string"}, text{"is"} , text{"here"}
You should get something like this:
a
string
is
here
Note: this is a sequence of FOUR items each serialized then followed by a NL.
While you're at it, you might as well discover you probably don't need the
text{} ... which creates *nodes*.
If what you want is just strings, then converting them to nodes is unnecessary,
even if you want them as a child of an element. The rules for combining
multiple strings (or other atomic values) are different ...
in this case they follow the element construction rules:
http://www.w3.org/TR/xquery-30/#id-content (sec 3.9.1.3)
"For each adjacent sequence of one or more atomic values returned by an
enclosed expression, a new text node is constructed, containing the result of
casting each atomic value to a string, with a single space character inserted
between adjacent values.
"
So try this:
<e>{
"a", "string", "is" , "here"
}</e>
What do you get ?
<e>a string is here</e>
Different! ... and often baffling to people until they figure out what's going
on.
This holds true even if you extract the text back out of the node.
like:
<e>{ "a", "string", "is" , "here"}</e>/string()
or
<e>{ "a", "string", "is" , "here"}</e>/node()
A way to double check is to count the results ... how many values in the above?
4?
Nope, 1.
count(<e>{ "a", "string", "is" , "here"}</e>/node())
1
count(("a", "string", "is" , "here")) (: note I had to enclose the sequence in () :)
4
Now if you don’t create an element, and do this directly:
"a", "string", "is" , "here"
What do you get? No element constructor rules, so we're back to the
serialization of multiple items ...
so you get
a
string
is
here
This is why concat and string-join (and in V7 and later the || operator) make a
difference.
"a" || "string" || "is" || "here"
concat("a", "string", "is" , "here")
string-join( ("a", "string", "is" , "here") , "" )
All produce 1 item (string) with no separators.
Same is true if you get fancy, like
string-join(
  for $i in 1 to 1000
  return concat( "a" , "big" , "runon" , "string", "#" , $i ,
    string-join(("these","are","colon","separated" ),":" ) ) , "" )
But stick that in an element instead of string joining and it's a tad different
<e>{ for $i in 1 to 1000
  return concat( "a" , "big" , "runon" , "string", "#" , $i ,
    string-join(("these","are","colon","separated" ),":" ) ) }</e>
Or outside an element ...
for $i in 1 to 1000
return concat( "a" , "big" , "runon" , "string", "#" , $i ,
  string-join(("these","are","colon","separated" ),":" ) )
All different, but once you get the rules it's quite predictable, and maybe even
sane.
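If you want to convince yourself, counting items is a quick check. This is just a sketch with a shortened loop (3 instead of 1000) and a made-up $parts variable; the comments give what the rules quoted above lead me to expect.

```xquery
(: hypothetical shortened version of the loop above :)
let $parts := for $i in 1 to 3 return concat( "row", "#", $i )
return (
  count( string-join( $parts, "" ) ),    (: expect 1: a single string, no separators :)
  count( <e>{ $parts }</e>/text() ),     (: expect 1: one text node, values joined by spaces :)
  count( $parts )                        (: expect 3: separate items, serialized one by one :)
)
```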
-----------------------------------------------------------------------------
David Lee
Lead Engineer
MarkLogic Corporation
[email protected]<mailto:[email protected]>
Phone: +1 812-482-5224
Cell: +1 812-630-7622
www.marklogic.com<http://www.marklogic.com/>
From:
[email protected]<mailto:[email protected]>
[mailto:[email protected]] On Behalf Of Tim
Sent: Monday, September 01, 2014 12:24 PM
To: 'MarkLogic Developer Discussion'
Subject: [MarkLogic Dev General] How to force EOL characters when downloading a
text file
Hi Folks,
I am extracting text from an xml file which can be downloaded by a user. The
file extension is custom. To create the record I basically walk through the XML
elements and generate the corresponding text, e.g.
text{"first line"},
text{"second line"},
…
When the user downloads the file, the Windows form of linefeeds is required
(CR-LF), and I'm trying to determine how to force that: whether it is in the
content type, the disposition, or merely in the way in which I add linefeeds to
the generated text, e.g.
text{"first line"}, "&#13;", "&#10;"
text{"second line"}, "&#13;", "&#10;"
…
It seems that using the text{""} directive adds the linefeed character to the
generated text without explicitly adding CR-LF.
Thanks for any help with this!
Tim M.