On Tue, Nov 07, 2017 at 03:35:45AM -0800, lesm...@gmail.com wrote:

> I am really struggling to access nested elements of an XML string and
> suspect it is down to the namespaces. This string is obtained from a
> larger document and is the "innerXML" of some elements. A simplified
> version is at...
>
> I could probably do this with multiple structs but want to have this in a
> single struct.
>
> https://play.golang.org/p/Een-guMNP9
>
> I can seem to read things at the root but cannot get them using the ">"
> syntax at all. What am I doing wrong? Can I "insert" a namespace element
> to assist it at all?
>
> I have manually removed the namespaces from this example to show what I
> think should happen!?
> https://play.golang.org/p/eCzbzgBYMq

The chief problem with your approach is the lack of error checking.
The encoding/xml.Unmarshal() function returns an error value. Had you
checked whether it was non-nil, it would have given you an immediate idea
of what was wrong with your approach.

Regarding namespaces, your hunch is correct: since your XML document is a
fragment extracted from another document by a seemingly "textual" method,
all those "XML namespace prefixes" (the parts of the element names which
come before the ':' characters) have no meaning to the XML parser since
they are not defined in the document itself.

Unfortunately, there is currently no way to define them explicitly
anywhere (say, in an instance of encoding/xml.Decoder) before decoding, so
you basically have three options:

- Textually stick their definitions onto the top element of your XML
  document fragment so that it reads something like

    <fdm:trackInformation xmlns:fdm="urn:whatever:ns1"
        xmlns:nxcm="http://example.com/another/namespace/uri/" ...>

  and then parse the resulting document into a value of a struct type
  whose field tags contain the full namespaces in the names of the XML
  elements they're supposed to decode.

- Use an iterative approach: create an instance of encoding/xml.Decoder
  and call its Token() method. When it returns a token of type
  StartElement or EndElement, its Name field can be examined to see what
  its "Space" and "Local" fields are.

- Ignore the XML namespace prefixes completely. In your case this appears
  to be the simplest solution as the names of the elements appear to be
  unique anyway.

The variant which checks for errors, ignores the XML namespace prefixes
and also defines a field named "XMLName" on the type to check the name of
the element it's supposed to unmarshal can be implemented as follows:

--------------------------------8<--------------------------------
package main

import (
	"encoding/xml"
	"log"
)

type TrackInformation struct {
	// XMLName makes the decoder verify the name of the root element.
	XMLName        struct{} `xml:"trackInformation"`
	TimeAtPosition string   `xml:"timeAtPosition"`
	Speed          int      `xml:"speed"`
	DepApt         string   `xml:"qualifiedAircraftId>departurePoint>airport"`
	ArrApt         string   `xml:"qualifiedAircraftId>arrivalPoint>airport"`
	Gufi           string   `xml:"qualifiedAircraftId>gufi"`
}

func main() {
	xmlToParse := `
<fdm:trackInformation>
  <nxcm:qualifiedAircraftId>
    <nxce:aircraftId>TEST</nxce:aircraftId>
    <nxce:gufi>KR32642300</nxce:gufi>
    <nxce:departurePoint>
      <nxce:airport>KJFK</nxce:airport>
    </nxce:departurePoint>
    <nxce:arrivalPoint>
      <nxce:airport>KJFK</nxce:airport>
    </nxce:arrivalPoint>
  </nxcm:qualifiedAircraftId>
  <nxcm:speed>245</nxcm:speed>
  <nxcm:timeAtPosition>2017-11-07T11:20:43Z</nxcm:timeAtPosition>
</fdm:trackInformation>`

	var trackInfo TrackInformation
	// Always check the error returned by Unmarshal.
	err := xml.Unmarshal([]byte(xmlToParse), &trackInfo)
	if err != nil {
		log.Fatal(err)
	}
	log.Println(trackInfo)
}
--------------------------------8<--------------------------------

Playground [1].

A couple more notes.

- You can't use namespaces when defining the names of the nested
  elements. The wording of the documentation is a bit vague but it does
  explicitly state this:

  «If the XML element contains a sub-element whose name matches the
  prefix of a tag formatted as "a" or "a>b>c"…»

  Notice that "the prefix of a tag" bit, which actually means "the local
  name of an element".

  So when you need to match on the full names of the elements, you have
  to use nested structs so that each field stands for an element without
  nesting, and the nesting is defined via your types rather than the tags
  on their fields.

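  A rough sketch of that nested-struct layout follows. It assumes the
  decoder records an undefined prefix such as "nxcm" as the element's
  name space (which is how the encoding/xml documentation describes it),
  so the prefix can be written in the "space local" form of a field tag.
  The type names and the parseTrack helper are hypothetical.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"log"
)

// point maps either <nxce:departurePoint> or <nxce:arrivalPoint>.
type point struct {
	Airport string `xml:"nxce airport"`
}

// aircraftID maps <nxcm:qualifiedAircraftId>.
type aircraftID struct {
	Gufi      string `xml:"nxce gufi"`
	Departure point  `xml:"nxce departurePoint"`
	Arrival   point  `xml:"nxce arrivalPoint"`
}

// trackInformation maps the root element; the undefined prefix "fdm"
// is used in place of a name space URL.
type trackInformation struct {
	XMLName        xml.Name   `xml:"fdm trackInformation"`
	Aircraft       aircraftID `xml:"nxcm qualifiedAircraftId"`
	Speed          int        `xml:"nxcm speed"`
	TimeAtPosition string     `xml:"nxcm timeAtPosition"`
}

// parseTrack (a hypothetical helper) unmarshals doc and reports any error.
func parseTrack(doc string) (trackInformation, error) {
	var ti trackInformation
	err := xml.Unmarshal([]byte(doc), &ti)
	return ti, err
}

const sample = `
<fdm:trackInformation>
  <nxcm:qualifiedAircraftId>
    <nxce:aircraftId>TEST</nxce:aircraftId>
    <nxce:gufi>KR32642300</nxce:gufi>
    <nxce:departurePoint>
      <nxce:airport>KJFK</nxce:airport>
    </nxce:departurePoint>
    <nxce:arrivalPoint>
      <nxce:airport>KJFK</nxce:airport>
    </nxce:arrivalPoint>
  </nxcm:qualifiedAircraftId>
  <nxcm:speed>245</nxcm:speed>
  <nxcm:timeAtPosition>2017-11-07T11:20:43Z</nxcm:timeAtPosition>
</fdm:trackInformation>`

func main() {
	ti, err := parseTrack(sample)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("gufi=%s dep=%s arr=%s speed=%d\n",
		ti.Aircraft.Gufi, ti.Aircraft.Departure.Airport,
		ti.Aircraft.Arrival.Airport, ti.Speed)
}
```

  The price is more types than the single-struct variant, but each
  element is matched on both its prefix and its local name.
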
- The XML decoder implements a "strict" mode, which is "on" by default.
  What's interesting about it is that even when it's on, it turns a blind
  eye to undefined XML namespace prefixes:

  «Strict mode does not enforce the requirements of the XML name spaces
  TR. In particular it does not reject name space tags using undefined
  prefixes. Such tags are recorded with the unknown prefix as the name
  space URL.»

  This means that you can use your undefined namespace prefixes "as is"
  when decoding. [2] demonstrates this approach applied to the top-level
  XML elements. You can't do this for that "a>b>c" notation in the tags
  but you can still apply it when implementing parsing using nested
  types.

- Another trick up the sleeve of the XML decoder is support for custom
  unmarshaling methods on your own types. Any of your types (such as
  TrackInformation) can implement a method

    UnmarshalXML(d *xml.Decoder, start xml.StartElement) error

  to make that type satisfy the encoding/xml.Unmarshaler interface. When
  the decoder sees that a type implements this interface, it calls the
  UnmarshalXML method instead of dealing with the element's contents
  itself.

  It follows that you can have a hierarchy of low-level unexported types
  and a top-level "facade" type defining UnmarshalXML, which internally
  first unmarshals the element using that hierarchy of types and then
  populates your "facade" type with the information that ended up in that
  hierarchy of values.

Hope this helps.

1. https://play.golang.org/p/KJvvWg9apu
2. https://play.golang.org/p/AR5vDTKX0Q