There are many people asking for various fixes to Daffodil.

You can certainly get the priority of a particular fix/feature increased by
advocating for it on dev or users mailing lists.

But... (my recruiting hat on now... you know what is coming...) you could
also contribute the fix (and a test case).

Daffodil is open source.

Fixing the text->number constructors so that whitespace is tolerated when
creating numeric values is an easy fix.
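
To give a sense of how small that kind of change is, here is a rough sketch
in Java of a whitespace-tolerant text-to-number conversion. This is only an
illustration and is not Daffodil's actual code (Daffodil itself is written in
Scala); the class and method names (UnsignedIntCast, trimXmlWhitespace,
toUnsignedInt) are made up for this example. It assumes that only leading and
trailing whitespace needs to be tolerated.

```java
import java.math.BigInteger;

// Hypothetical helper for illustration only; not Daffodil's actual code.
final class UnsignedIntCast {
    // Largest value allowed by xs:unsignedInt (2^32 - 1).
    private static final BigInteger MAX_UNSIGNED_INT = BigInteger.valueOf(4294967295L);

    // Strip leading and trailing XML whitespace (space, tab, CR, LF).
    static String trimXmlWhitespace(String s) {
        int start = 0;
        int end = s.length();
        while (start < end && isXmlSpace(s.charAt(start))) start++;
        while (end > start && isXmlSpace(s.charAt(end - 1))) end--;
        return s.substring(start, end);
    }

    private static boolean isXmlSpace(char c) {
        return c == ' ' || c == '\t' || c == '\r' || c == '\n';
    }

    // Convert text to an unsignedInt value, tolerating surrounding whitespace.
    // Throws NumberFormatException if the trimmed text is not a valid value.
    static long toUnsignedInt(String text) {
        BigInteger value = new BigInteger(trimXmlWhitespace(text));
        if (value.signum() < 0 || value.compareTo(MAX_UNSIGNED_INT) > 0) {
            throw new NumberFormatException("Out of range for xs:unsignedInt: " + text);
        }
        return value.longValue();
    }
}
```

A test case for a fix like this would feed in text such as "   42" and check
that the result is 42, and also check that text with interior whitespace such
as "4 2" is still rejected.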

Adding additional functions is also pretty easy. We have used that as a
learning exercise for new contributors in the past.

We actually consciously try not to rapid-fix such easy things because,
well, we want to encourage more contributors to Daffodil.
That doesn't mean an existing contributor won't fix the issue. It just
means we will attempt to recruit you to contribute the fix yourself first
(i.e., this email :-) ).

And as far as the DFDL language standard goes, that workgroup now operates
on the principle that experience with an implementation of a new
capability must first be documented before it can be incorporated into the
standard.

On Fri, Mar 18, 2022 at 6:50 AM Attila Horvath <attila.j.horv...@gmail.com>
wrote:

> Mike
>
> Appreciate the suggested workaround. I did incorporate/test your snippet
> per Mar 16, 2022 at 12:21 PM [below] w/ following anticipated results:
>
> [1]Satellite numbers w/ leading whitespace(s) yields lossless unparse
> results.
> [2]Satellite numbers w/ leading zero(s) or whitespace(s)+zero(s) yield
> unparse results that are 'numerically' equivalent |HOWEVER| unparsed target
> ASCII file fails to compare w/ parsed source ASCII file due to
> <<<dfdl:textNumberPattern="####0">>> formatting that trims leading
> irrelevant characters - see attached parse/source and unparse/target files.
> NB "00000" formatting yields opposite results by trimming leading
> whitespace(s).
>
> The issue/concern of 'lossless' parse/unparse processing for our
> organization is fundamental. Our organization has no control over
> customers' [legacy] pre-/post- processes |AND| the format of input data.
> Ergo lossless end-to-end data transformation is essential b/c if/when
> source/target data fail to compare, we're placed in the untenable position
> of explaining differences on case by case basis.
>
> The impetus/urgency for my questions below re: ticket is that producing
> lossless end-to-end data transformation results via string processing with
> suite of XPATH functions is more important/suitable than yielding
> numerically 'equivalent' results.
>
> Thx - Attila
>
> On Fri, Mar 18, 2022 at 5:12 AM Attila Horvath <attila.j.horv...@gmail.com>
> wrote:
>
>> I know you're preparing to release 3.3.0.
>>
>> When do you think this issue might be resolved? Which point release are you
>> targeting?
>>
>> On a related subject, Daffodil implements a subset of XPATH functions.
>> Might the dev team consider implementing all XPATH functions in lieu of
>> workarounds?
>>
>> Thx in advance - Attila
>>
>> On Wed, Mar 16, 2022 at 12:26 PM Mike Beckerle <mbecke...@apache.org>
>> wrote:
>>
>>> Created https://issues.apache.org/jira/browse/DAFFODIL-2676
>>>
>>> On Wed, Mar 16, 2022 at 12:21 PM Mike Beckerle <mbecke...@apache.org>
>>> wrote:
>>>
>>> > Ok, I found the attachment. Sorry for the delay.
>>> >
>>> > The challenge here is that you are expecting the
>>> > xs:unsignedInt(../Line1.02-Satellite) call to tolerate whitespace,
>>> > which it seems it does not.
>>> >
>>> > I think this is a Daffodil bug, as the constructors like xs:unsignedInt
>>> > are supposed to work like they do in XPath, and the XPath functions spec
>>> > says that when converting from strings, whitespace normalization applies,
>>> > which trims all leading and trailing whitespace. It's less clear
>>> > about whether interior whitespace is collapsed, but leading/trailing
>>> > whitespace definitely seems to be trimmed.
>>> >
>>> > So I'll add a JIRA ticket about this.
>>> >
>>> > For how to work around, I suggest parsing the satellite field not as a
>>> > string, but as an unsignedInt from the start.
>>> >
>>> > So like:
>>> >
>>> > <xs:element name="satellite-num-range" type="xs:unsignedInt"
>>> > dfdl:lengthKind="explicit" dfdl:length="5"
>>> >   dfdl:textTrimKind="padChar" dfdl:textPadKind="padChar"
>>> > dfdl:textNumberPadCharacter="%SP;" dfdl:textNumberJustification="right"
>>> >   dfdl:textNumberPattern="####0"/>
>>> >
>>> > I didn't run this, but I think this will remove leading spaces when
>>> > parsing, and add leading spaces to your 5 character element when
>>> > unparsing.
>>> >
>>> > Another way to express this, since you need only leading padding, is this:
>>> >
>>> > <xs:element name="satellite-num-range" type="xs:unsignedInt"
>>> > dfdl:lengthKind="explicit" dfdl:length="5"
>>> >   dfdl:textNumberPattern="* ####0"/>
>>> >
>>> > In that textNumberPattern the "* " means spaces are the pad character to
>>> > be used, and when there is no digit for the position of a "#" then the pad
>>> > character from the pattern (not the textNumberPadCharacter) is used.
>>> >
>>> > Both kinds of padding can be used together. E.g., you could have number
>>> > text right-justified in a fixed-length field of width 6, using "*" to pad
>>> > to width 5, so that you can get " **123".
>>> >
>>> > <xs:element name="starPadNum" type="xs:unsignedInt"
>>> > dfdl:lengthKind="explicit" dfdl:length="6"
>>> >   dfdl:textTrimKind="padChar" dfdl:textPadKind="padChar"
>>> > dfdl:textNumberPadCharacter="%SP;" dfdl:textNumberJustification="right"
>>> >   dfdl:textNumberPattern="**####0"/>
>>> >
>>> > I didn't run these, but this is, I believe, how it is supposed to work.
>>> >
>>> >
>>> >
>>>
>>>
