Re: PR for VXQUERY-67

Till Westmann Wed, 30 Mar 2016 23:17:25 -0700

Hi Riyafa,

On 27 Mar 2016, at 23:50, Riyafa Abdul Hameed wrote:

I modified the pull request and was able to fix most of the previous
errors, but this brought up a different kind of error.
Consider the function fn:tokenize. According to the definition [1] it
should return a sequence of strings.
*Definition:*

fn:tokenize($input as xs:string?, $pattern as xs:string) as xs:string*
fn:tokenize($input as xs:string?, $pattern as xs:string, $flags as
xs:string) as xs:string*
where xs:string* represents a sequence of strings. Earlier, not having read
this definition properly, I returned a single string with each tokenized part
separated by a single space, and the results did not pass because the tests
were expecting a sequence.
Now I have modified the code to return a sequence of strings for the above
function, but now most of the tests fail. I have found the reason for this
by remote debugging the org.apache.vxquery.result.ResultUtils class:
The result tested by the test suite requires the final result to be a sequence
of strings separated by a single white space, but the string generated in
the ResultUtils class has a sequence of strings separated by the newline
(\n) character, because of which the tests on tokenize fail.

For example, consider test 14015 in [2]:

fn:tokenize("The cat sat on the mat", "\s+")

The expected result is:
The cat sat on the mat

But the string generated (at ResultUtils) and printed on the console is:

The
cat
sat
on
the
mat

each string separated by a new line character.

For the same reason other tests also fail (e.g. 13950 in [2]).

Shall I create an issue in JIRA so that a fix could be made by which,
instead of using a new line character to separate the values in a sequence
when printing to the console, a single whitespace would be used?

Yes, you are right, this is indeed a serialization problem (i.e. a problem of
how we serialize instances of the XQuery Data Model) and it should be
captured in a JIRA. However, I am not sure that we always want to move to a
single space. I think that the issue should state that we need to find out
where the new line is introduced and that we need to discuss/decide in which
cases we want a single space and in which ones a new line is preferable.
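
To make the difference concrete, here is a minimal Java sketch. It is not
VXQuery code: the class and the helpers tokenize and serialize are
hypothetical, and java.util.regex splitting only approximates fn:tokenize
(it differs in edge cases such as empty tokens). It only shows that the same
sequence of strings prints differently depending on the separator that the
serializer puts between the items.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class TokenizeSerializationSketch {
    // Hypothetical stand-in for fn:tokenize: split the input on a regex
    // and return the tokens as a sequence (here: a List of strings).
    static List<String> tokenize(String input, String pattern) {
        return Arrays.asList(Pattern.compile(pattern).split(input));
    }

    // Hypothetical serializer: the separator is the only thing that differs
    // between the output the test expects and the output seen on the console.
    static String serialize(List<String> items, String separator) {
        return String.join(separator, items);
    }

    public static void main(String[] args) {
        List<String> tokens = tokenize("The cat sat on the mat", "\\s+");
        // Items separated by a single space, as in the expected test result.
        System.out.println(serialize(tokens, " "));
        // Items separated by a new line, as in the observed console output.
        System.out.println(serialize(tokens, "\n"));
    }
}
```

In both cases the sequence has the same six items, so the function result
itself is the same and only the serialization differs.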

Also I am not sure I understand your instruction on how to make a PR for the
change in the materialized results. What do you mean by "checking that only
the order has changed"? (sorry if it is a silly question)


On the website we have instructions on how to generate the XQTS results.
However, the results that are currently checked in are not in the order that
they would be in if you follow the instructions. The instructions tell us to
sort the results using "sort", while the checked-in results were sorted using
"sort -V". While using "sort -V" creates a result that’s more sensibly
sorted, the "-V" option is not available in "sort" on all platforms (e.g. it
is not available on OS X). So I think that we should move back to sorting
the results with the plain "sort" to ensure that everybody can update and
compare the results. [1]

My proposal was that you could
1) take the current master branch,
2) run the tests,
3) sort the results of the tests with "sort",
4) sort the checked in results with "sort",
5) verify that the sorted results from 3) and 4) are identical, and

6) create a new pull request to update the now differently sorted reference
   results.

If 5) succeeds you will have done a reasonable check that only the order of
the reference results has changed.
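
The idea behind step 5) can be sketched in Java as follows. This is an
illustration only: the class, the helper names and the sample result lines
are hypothetical and do not reflect the real format of the XQTS results
file, and Java's String ordering is not guaranteed to match the ordering
that "sort" uses in a given locale. The point is that two files that differ
only in line order become identical once both are sorted the same way.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class SortedResultsCheck {
    // Return a sorted copy so that the input list is left untouched.
    static List<String> sorted(List<String> lines) {
        List<String> copy = new ArrayList<>(lines);
        Collections.sort(copy);
        return copy;
    }

    // True if both lists contain the same lines, regardless of their order.
    static boolean sameUpToOrder(List<String> a, List<String> b) {
        return sorted(a).equals(sorted(b));
    }

    public static void main(String[] args) {
        // Hypothetical result lines; the real results file looks different.
        List<String> freshRun = Arrays.asList("test-10, pass", "test-2, pass", "test-1, fail");
        List<String> checkedIn = Arrays.asList("test-1, fail", "test-2, pass", "test-10, pass");
        System.out.println(sameUpToOrder(freshRun, checkedIn));
    }
}
```

If the comparison fails, something other than the order has changed and the
difference is worth looking at before creating the pull request.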

If all of this works you will have

a) validated that you indeed get the expected reference results before your
   fix and
b) created a pull request for reference results in a form that makes the
   comparison with the results that you'll get after your fix easier.

I think that the big diff [2] that you see in the materialized reference
results right now is due to both the changes introduced by your change and the
changes introduced by the different sort order.

Does this make sense?

Cheers,
Till

[1] https://issues.apache.org/jira/browse/VXQUERY-187

[2] https://github.com/apache/vxquery/pull/32/commits/131915a2bb38b06e6ef2d27a24c50201d1dab13c

Please kindly help.


[1] https://www.w3.org/TR/xpath-functions/#func-tokenize

[2] http://riyafa.github.io/Riyafa-Abdul-Hameed--web-page/others/full_report.html

Thank you.

Yours sincerely,
Riyafa

On 16 March 2016 at 10:59, Till Westmann <[email protected]> wrote:
Hi,
I took a brief look into your change and here are a few next steps that
could help to get a better handle on the issue:
1) One of the problems with the diff for the expected results is that the
   instructions to create the diff that you find on the website are not
   consistent with the current reality [1]. So one good step would be to
   a) recreate the expected results with an unmodified checkout following
      the instructions on the website and
   b) checking that only the order has changed, and
   c) creating a PR for that.
2) Rerun the tests with your patch, categorize the failures by stack trace,
   and explain at least one failure in more detail on the list (with a
   stack trace, pointers to the code and possible explanations - if they
   come to mind). E.g. it would be good to see in the e-mail what the
   system did when the creation of a sequence for fn:tokenize didn’t work.
Does this make sense?

Cheers,
Till

[1] https://issues.apache.org/jira/browse/VXQUERY-187


On 11 Mar 2016, at 18:11, Riyafa Abdul Hameed wrote:

Hi,
I tried fixing the errors from the test results and I was unable to fix
some of them. You can find the full error report here [1]. The test cases
related to this PR are from 13865 to 14054.
There are errors related to exception handling, and since I am using the
available Java functions I am not sure how I could catch such errors.
Also I don't seem to be matching UTF-8 strings. I tried to get the byte
array and convert it to a UTF-8 string, but it wouldn't work.
Related errors are: 13918 to 13921.
According to [2] I think we should convert all the UTF-8 characters as
appropriate when adding to a StringBuilder in the UTF8StringPointable
class. I am not sure how I could do that.
Also I tried converting the result of fn:tokenize to a sequence of strings
(using sequence builder) instead of a single string, but in vain.
Maybe I have understood things incorrectly. Can you please help me figure
out how I could fix these errors?
(I sent a previous mail which was not delivered because I tried to attach
the error report)

[1] http://riyafa.github.io/Riyafa-Abdul-Hameed--web-page/others/full_report.html
[2] http://stackoverflow.com/a/5729843/3599535

Thank you.

Yours sincerely,
Riyafa

On 10 March 2016 at 14:04, Till Westmann <[email protected]> wrote:

Hi Riyafa,
I just looked at your PR [1] and realized that the diff in the results
file is very big.
I think that this might be due to a recent commit by Preston [2] that
changed the sorting of the results file a bit.
Could you take a look if that’s indeed the case and - if so - create a
new results file with the same order that’s currently checked in?
Otherwise, could you validate that queries that use the new functions
work correctly now?

Cheers,
Till

[1] https://github.com/apache/vxquery/pull/32/
[2] https://github.com/apache/vxquery/commit/43852a5476ccb33bf9ee58e27468b400cc169d6a#diff-39476c050696c8ab9f59540b607ba92e
--
Riyafa Abdul Hameed
Undergraduate, University of Moratuwa

Email: [email protected]
Website: https://riyafa.wordpress.com/<http://riyafa.wordpress.com/>
<http://facebook.com/riyafa.ahf>  <http://lk.linkedin.com/in/riyafa>
<http://twitter.com/Riyafa1>
