Hi Christian,

Currently, I am using HTML Tidy to reformat the XML output. It gives me
the formatting I need, which is Git-diff friendly.

Jonathan


$ nodes % tidy --version

HTML Tidy for Apple macOS version 5.6.0

$ nodes % tidy -config tidy.config  03-luke.xml

Sample Output:

<?xml version="1.0"?>
<Sentences>
    <Sentence ref="LUK 1:1!1-1:4!8">
        <Trees>
            <Tree>
                <Node Cat="S"
                      Head="0"
                      nodeId="420010010010421">
                    <Node Cat="CL"
                          Start="0"
                          End="41"
                          Rule="ClCl"
                          Head="0"
                          nodeId="420010010010420">
                        <Node Cat="CL"
                              Start="0"
                              End="33"
                              Rule="ClCl"
                              Head="0"
                              nodeId="420010010010340">
                            <Node Cat="CL"
                                  Start="0"
                                  End="31"
                                  Rule="ClCl2"
                                  Head="1"
                                  nodeId="420010010010320">
                                <Node Cat="CL"
                                      Start="0"
                                      End="22"
                                      Rule="sub-CL"
                                      nodeId="420010010010230">
                                    <Node xml:id="n42001001001"
                                          ref="LUK 1:1!1"
                                          Cat="conj"
                                          Start="0"
                                          End="0"
                                          StrongNumber="1895"
                                          UnicodeLemma="ἐπειδήπερ"
                                          FunctionalTag="CONJ"
                                          Type=""
                                          morphId="42001001001"
                                          NormalizedForm="Ἐπειδήπερ"
                                          Unicode="Ἐπειδήπερ"
                                          FormalTag="CONJ"

tidy.config

add-xml-decl: true
drop-empty-paras: false
fix-backslash: false
fix-bad-comments: false
fix-uri: false
input-xml: true
join-styles: false
literal-attributes: true
lower-literals: false
output-xml: true
preserve-entities: true
quote-ampersand: false
quote-marks: false
quote-nbsp: false

indent: auto
indent-attributes: true
indent-spaces: 4
tab-size: 4
vertical-space: true
wrap: 150

char-encoding: utf8
input-encoding: utf8
newline: CRLF
output-encoding: utf8

quiet: true
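
For anyone who would rather not shell out to tidy, the layout in the sample
output above can be approximated in a few lines of Python. This is only a
minimal sketch, not what tidy or BaseX does internally. The function name
serialize is made up for illustration. It assumes element-only content (text
mixed with child elements, and tail text, are dropped), and it does not handle
namespaced attributes such as xml:id, which ElementTree expands to its
{namespace}name form.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr


def serialize(elem, depth=0, indent=4):
    """Return the lines for elem, one attribute per line, aligned.

    Hypothetical helper for illustration only. Limitations: mixed
    content and tail text are dropped; namespaced names are not mapped
    back to prefixes.
    """
    pad = " " * (depth * indent)
    open_tag = f"{pad}<{elem.tag}"
    # Continuation attributes line up under the first attribute.
    attr_pad = " " * (len(open_tag) + 1)
    # ElementTree keeps attributes in document order (Python 3.8+).
    attrs = [f"{name}={quoteattr(value)}" for name, value in elem.attrib.items()]

    lines = []
    if attrs:
        lines.append(f"{open_tag} {attrs[0]}")
        lines.extend(attr_pad + attr for attr in attrs[1:])
    else:
        lines.append(open_tag)

    children = list(elem)
    text = (elem.text or "").strip()
    if children:
        # Element content: close the start tag, recurse, emit the end tag.
        lines[-1] += ">"
        for child in children:
            lines.extend(serialize(child, depth + 1, indent))
        lines.append(f"{pad}</{elem.tag}>")
    elif text:
        # Text-only leaf: keep text and end tag on the last attribute line.
        lines[-1] += f">{escape(text)}</{elem.tag}>"
    else:
        # Empty element.
        lines[-1] += "/>"
    return lines


if __name__ == "__main__":
    sample = (
        '<Node Cat="S" Head="0" nodeId="420010010010421">'
        '<Node Cat="CL" Start="0" End="41"/>'
        "</Node>"
    )
    print("\n".join(serialize(ET.fromstring(sample))))
```

Run on that two-node sample, the script prints:

<Node Cat="S"
      Head="0"
      nodeId="420010010010421">
    <Node Cat="CL"
          Start="0"
          End="41"/>
</Node>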



On Wed, Feb 15, 2023 at 3:06 AM Christian Grün <christian.gr...@gmail.com>
wrote:

> Hi Patrick
>
> I noticed that the attributes for the wg element had not been aligned, so
> I was wondering if you were thinking of a more advanced rule.
>
> Or would you possibly like to supply the names of the elements for which
> the alignment should take place?
>
> Best,
> Christian
>
>
>
> On Wed, Feb 15, 2023, 03:51, Patrick Durusau <patr...@durusau.net> wrote:
>
>> Christian,
>>
>> Ah, no, it isn't about the length of the element name plus attributes, but
>> about the ability to align the attributes of an element, as you see in my
>> post for the <w> element. Each key/value pair is followed by a line break.
>>
>> In the meantime, the current version of tidy has been added to the
>> workflow to produce the desired results.
>>
>> But it would be great to have it native to BaseX!
>>
>> Thanks!
>>
>> Patrick
>>
>> On 2/14/23 01:30, Christian Grün wrote:
>> > Hi Patrick,
>> >
>> > There’s currently no serialization parameter to control the custom
>> > indentation of attributes.
>> >
>> > If I get you correctly, you’d like to get attributes indented if the
>> > combined string length of the element name and the attributes exceeds a
>> > specific maximum length?
>> >
>> > Best,
>> > Christian
>> >
>> >
>> > On Mon, Feb 13, 2023 at 9:10 PM Patrick Durusau <patr...@durusau.net> wrote:
>> >> Greetings!
>> >>
>> >> I've been tasked with using BaseX to produce:
>> >>
>> >> *****
>> >>
>> >>            <wg class="cl" rule="S-IO" cltype="VerbElided">
>> >>               <wg rule="NpaNp" role="s">
>> >>                  <wg type="group" appositioncontainer="true" rule="Np-Appos">
>> >>                     <w ref="PHM 1:1!1"
>> >>                        after=" "
>> >>                        class="noun"
>> >>                        gbiType="proper"
>> >>                        xml:id="n57001001001"
>> >>                        lemma="Παῦλος"
>> >>                        normalized="Παῦλος"
>> >>                        strong="3972"
>> >>                        number="singular"
>> >>                        gender="masculine"
>> >>                        case="nominative"
>> >>                        gloss="Paul"
>> >>                        domain="093001"
>> >>                        ln="93.294a"
>> >>                        morph="N-NSM"
>> >>                        unicode="Παῦλος">Παῦλος</w>
>> >>
>> >> *****
>> >>
>> >> The indenting is easy enough and I can even make it deeper if required
>> >> but is there a command for serialization that will properly format the
>> >> attributes?
>> >>
>> >> My personal suspicion is that inserting \n after each attribute as it
>> >> is serialized (except the last one) is the easier route, but I promised
>> >> to investigate the command line.
>> >>
>> >> Have I overlooked something in the very fine manual?
>> >>
>> >> Hope everyone is having a great week!
>> >>
>> >> Patrick
>> >>
>> >> --
>> >> Patrick Durusau
>> >> patr...@durusau.net
>> >> Technical Advisory Board, OASIS (TAB)
>> >> Editor, OpenDocument Format TC (OASIS), Project Editor ISO/IEC 26300
>> >> Co-Editor, ISO/IEC 13250-1, 13250-5 (Topic Maps)
>> >>
>> >> Another Word For It (blog): http://tm.durusau.net
>> >> Homepage: http://www.durusau.net
>> >> Twitter: patrickDurusau
>> >>
