Hi Christian,
Currently, I am using HTML Tidy to reformat the XML output. It gives me
the formatting I need, which is Git-diff friendly.
Jonathan
$ nodes % tidy --version
HTML Tidy for Apple macOS version 5.6.0
$ nodes % tidy -config tidy.config 03-luke.xml
Sample Output:
<?xml version="1.0"?>
<Sentences>
<Sentence ref="LUK 1:1!1-1:4!8">
<Trees>
<Tree>
<Node Cat="S"
Head="0"
nodeId="420010010010421">
<Node Cat="CL"
Start="0"
End="41"
Rule="ClCl"
Head="0"
nodeId="420010010010420">
<Node Cat="CL"
Start="0"
End="33"
Rule="ClCl"
Head="0"
nodeId="420010010010340">
<Node Cat="CL"
Start="0"
End="31"
Rule="ClCl2"
Head="1"
nodeId="420010010010320">
<Node Cat="CL"
Start="0"
End="22"
Rule="sub-CL"
nodeId="420010010010230">
<Node xml:id="n42001001001"
ref="LUK 1:1!1"
Cat="conj"
Start="0"
End="0"
StrongNumber="1895"
UnicodeLemma="ἐπειδήπερ"
FunctionalTag="CONJ"
Type=""
morphId="42001001001"
NormalizedForm="Ἐπειδήπερ"
Unicode="Ἐπειδήπερ"
FormalTag="CONJ"
tidy.config:
add-xml-decl: true
drop-empty-paras: false
fix-backslash: false
fix-bad-comments: false
fix-uri: false
input-xml: true
join-styles: false
literal-attributes: true
lower-literals: false
output-xml: true
preserve-entities: true
quote-ampersand: false
quote-marks: false
quote-nbsp: false
indent: auto
indent-attributes: true
indent-spaces: 4
tab-size: 4
vertical-space: true
wrap: 150
char-encoding: utf8
input-encoding: utf8
newline: CRLF
output-encoding: utf8
quiet: true
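
For anyone who cannot add tidy to their workflow, the same layout (one attribute per line, aligned under the first attribute) can be sketched in a few lines of Python. This is only an illustration of the idea discussed in this thread, not what tidy or BaseX does internally. The function name serialize and its parameters are my own, and it assumes element-only content without namespaces: mixed content, tail text, comments and prefixed attributes such as xml:id are not handled.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr


def serialize(elem, depth=0, indent="    "):
    """Return elem as a string with one attribute per line.

    Hypothetical helper: ignores tail text, comments and namespaces.
    """
    pad = indent * depth
    open_tag = f"{pad}<{elem.tag}"
    # Continuation attributes line up under the first attribute.
    attr_pad = " " * (len(open_tag) + 1)
    attrs = [f"{name}={quoteattr(value)}" for name, value in elem.attrib.items()]
    head = open_tag + (" " + f"\n{attr_pad}".join(attrs) if attrs else "")

    text = (elem.text or "").strip()
    children = list(elem)
    if not children and not text:
        return head + "/>"
    if not children:
        # Leaf element: keep the text on the same line as the last attribute.
        return f"{head}>{escape(text)}</{elem.tag}>"

    lines = [head + ">"]
    lines += [serialize(child, depth + 1, indent) for child in children]
    lines.append(f"{pad}</{elem.tag}>")
    return "\n".join(lines)


doc = ET.fromstring(
    '<wg rule="NpaNp" role="s">'
    '<w ref="PHM 1:1!1" after=" " class="noun">Παῦλος</w>'
    "</wg>"
)
result = serialize(doc)
print(result)
```

Because every attribute sits on its own line, a change to a single attribute value shows up as a one-line diff in Git.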
On Wed, Feb 15, 2023 at 3:06 AM Christian Grün <[email protected]>
wrote:
> Hi Patrick
>
> I noticed that the attributes for the wg element had not been aligned, so
> I was wondering if you were thinking of a more advanced rule.
>
> Or would you possibly like to supply the names of the elements for which
> the alignment should take place?
>
> Best,
> Christian
>
>
>
> Patrick Durusau <[email protected]> wrote on Wed, Feb 15, 2023,
> 03:51:
>
>> Christian,
>>
>> Ah, no, it isn't about the length of the element name plus attributes, but
>> the ability to align the attributes of an element, as you see in my post
>> for the <w> element. Each key/value pair is followed by a line break.
>>
>> In the meantime, the current version of tidy has been added to the
>> workflow to produce the desired results.
>>
>> But it would be great to have it native to BaseX!
>>
>> Thanks!
>>
>> Patrick
>>
>> On 2/14/23 01:30, Christian Grün wrote:
>> > Hi Patrick,
>> >
>> > There’s currently no serialization parameter to control the custom
>> > indentation of attributes.
>> >
>> > If I get you correctly, you’d like to get attributes indented if the
>> > combined string length of the element name and the attributes exceeds a
>> > specific maximum length?
>> >
>> > Best,
>> > Christian
>> >
>> >
>> > On Mon, Feb 13, 2023 at 9:10 PM Patrick Durusau <[email protected]> wrote:
>> >> Greetings!
>> >>
>> >> I've been tasked with using BaseX to produce:
>> >>
>> >> *****
>> >>
>> >> <wg class="cl" rule="S-IO" cltype="VerbElided">
>> >> <wg rule="NpaNp" role="s">
>> >> <wg type="group" appositioncontainer="true" rule="Np-Appos">
>> >> <w ref="PHM 1:1!1"
>> >> after=" "
>> >> class="noun"
>> >> gbiType="proper"
>> >> xml:id="n57001001001"
>> >> lemma="Παῦλος"
>> >> normalized="Παῦλος"
>> >> strong="3972"
>> >> number="singular"
>> >> gender="masculine"
>> >> case="nominative"
>> >> gloss="Paul"
>> >> domain="093001"
>> >> ln="93.294a"
>> >> morph="N-NSM"
>> >> unicode="Παῦλος">Παῦλος</w>
>> >>
>> >> *****
>> >>
>> >> The indenting is easy enough and I can even make it deeper if required
>> >> but is there a command for serialization that will properly format the
>> >> attributes?
>> >>
>> >> My personal suspicion is that inserting \n when each attribute is
>> >> serialized (and not on the last one) is the easier route but I promised
>> >> to investigate the command line.
>> >>
>> >> Have I overlooked something in the very fine manual?
>> >>
>> >> Hope everyone is having a great week!
>> >>
>> >> Patrick
>> >>
>> >> --
>> >> Patrick Durusau
>> >> [email protected]
>> >> Technical Advisory Board, OASIS (TAB)
>> >> Editor, OpenDocument Format TC (OASIS), Project Editor ISO/IEC 26300
>> >> Co-Editor, ISO/IEC 13250-1, 13250-5 (Topic Maps)
>> >>
>> >> Another Word For It (blog): http://tm.durusau.net
>> >> Homepage: http://www.durusau.net
>> >> Twitter: patrickDurusau
>> >>