Re: [basex-talk] TagSoup and html5 support

2016-12-21 Thread Alexander Shpack
On Wed, Dec 21, 2016 at 3:46 PM, Christian Grün 
wrote:

> > Nope, I haven't. It should be implemented in the BaseX server; we are
> > using RESTXQ functionality.
>
> If we find working and light-weight alternatives, we could replace the
> original distribution of TagSoup with the new solution. Suggestions
> are welcome.
>

I think TagSoup is good enough, but it requires some HTML5 tuning. I don't
know of any library that is lightweight and has the same feature list as
TagSoup.

-- 
s0rr0w


Re: [basex-talk] How to save Table views as spreadsheet

2016-12-21 Thread Marc van Grootel
Hi Constantine,

Instead of using POI, you could do it with standard XQuery and XSLT. Here's
the gist (pun intended) of it:
https://gist.github.com/xokomola/59a590a423b86bb3ea809c1b03706c4b
I haven't checked it for correctness, but it gives you an idea of how it works.

--Marc

On Wed, Dec 14, 2016 at 9:15 AM, Hondros, Constantine (ELS-AMS)
 wrote:
> Hi there,
>
> Unless I'm mistaken, I think the easiest thing to do is to serialise as CSV 
> and import into something like Excel.
>
> The CSV module can help you here: http://docs.basex.org/wiki/CSV_Module
>
> Although thinking about it, perhaps other folks have tried wrapping a library 
> like Apache POI (https://poi.apache.org/) so that spreadsheets can be created 
> directly from XQuery?
>
> Constantine
>
> -----Original Message-----
> From: basex-talk-boun...@mailman.uni-konstanz.de 
> [mailto:basex-talk-boun...@mailman.uni-konstanz.de] On Behalf Of 
> kalameg...@gmail.com
> Sent: 14 December 2016 00:10
> To: basex-talk@mailman.uni-konstanz.de
> Subject: [basex-talk] How to save Table views as spreadsheet
>
> I am new to BaseX, which I find to be a powerful tool for working with XML
> data. The BaseX GUI is great. My question is: how can I save the tables in
> the visualization as spreadsheet(s)? I understand there are multiple views,
> but it should be possible to save them as separate tables or tabs in a
> single spreadsheet.
>
> I apologize if this has been addressed in an earlier thread.
> I would also like to know if there is a way to quickly search the
> basex-talk archives without having to look through them all (arranged by
> month).
>
> Thanks!   -R. Kalamegham





Re: [basex-talk] TagSoup and html5 support

2016-12-21 Thread George Sofianos

> Interesting. Is it possible to use it for converting HTML to XML?

I'm not really sure about that. It looks like it parses HTML into a DOM
document object, so I'm not sure if this can work with BaseX.

> I see. So it probably sends request headers like "Accept-Encoding:
> x-compress; x-zip" to the server and unzips the result, is this right?

Yes, it sends the request with an Accept-Encoding header for gzip, retrieves
the gzipped response, and then unzips the content into a stream.

> I don’t know much about HTTP caching so far, though.

HttpClient has support for some caching libraries, which means it can
download the XML files into custom disk storage and then just check on
every document request whether they have changed. If a file hasn't changed
and the server supports HTTP caching, a 304 response is returned to the
client, so it doesn't need to download the file a second time.
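
The revalidation step described above can be sketched with JDK classes only.
This is a rough illustration, not Apache HttpClient's caching module and not
BaseX code: the class ConditionalFetch, its Entry type and the methods
canReuse and fetch are hypothetical names, and the cache is kept in memory
rather than on disk to keep the sketch short.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration; not Apache HttpClient and not BaseX code.
final class ConditionalFetch {

  // A cached body together with the validator (ETag) the server sent with it.
  static final class Entry {
    final byte[] body;
    final String etag;

    Entry(byte[] body, String etag) {
      this.body = body;
      this.etag = etag;
    }
  }

  // In-memory cache keyed by URL (a real cache would persist to disk).
  private final Map<String, Entry> cache = new HashMap<>();

  // A cached copy may be reused only if one exists and the server answered 304.
  static boolean canReuse(int status, Entry cached) {
    return cached != null && status == HttpURLConnection.HTTP_NOT_MODIFIED;
  }

  // Fetches a URL, revalidating a cached copy with If-None-Match when possible.
  byte[] fetch(String url) throws IOException {
    Entry cached = cache.get(url);
    HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
    if (cached != null && cached.etag != null) {
      conn.setRequestProperty("If-None-Match", cached.etag);
    }
    if (canReuse(conn.getResponseCode(), cached)) {
      return cached.body; // 304 Not Modified: nothing was downloaded again
    }
    try (InputStream in = conn.getInputStream()) {
      byte[] body = in.readAllBytes();
      cache.put(url, new Entry(body, conn.getHeaderField("ETag")));
      return body;
    }
  }

  // Fetches the URL given on the command line twice; the second request is
  // revalidated if the server sent an ETag with the first response.
  public static void main(String[] args) throws IOException {
    if (args.length == 0) {
      System.out.println("usage: java ConditionalFetch.java <url>");
      return;
    }
    ConditionalFetch client = new ConditionalFetch();
    System.out.println(client.fetch(args[0]).length + " bytes (first request)");
    System.out.println(client.fetch(args[0]).length + " bytes (second request)");
  }
}
```

Whether the second request actually results in a 304 depends on the server
sending an ETag and honouring If-None-Match.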


Re: [basex-talk] TagSoup and html5 support

2016-12-21 Thread Christian Grün
> In our projects though, we are using https://jsoup.org/ and it works well,
> also very easy to use.

Interesting. Is it possible to use it for converting HTML to XML?

> I'm talking about calls that happen using XQuery
> doc("http://randomhost.rn/random.xml").

I see. So it probably sends request headers like "Accept-Encoding:
x-compress; x-zip" to the server and unzips the result, is this right?

Maybe we could easily realize something similar in BaseX without an
additional library, at least for (g)zipped streams. (because I still
try to keep the BaseX distribution as small as possible...). There is
already an existing issue for that [1]. I don’t know much about HTTP
caching so far, though.
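
As a rough idea of what gzip handling without an additional library might
look like, here is a minimal sketch that uses only JDK classes
(HttpURLConnection and java.util.zip). The class GzipFetch and its methods
open and decode are made-up names for illustration; this is an assumption
about one possible approach, not how BaseX implements or will implement it.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper for illustration; not part of BaseX.
final class GzipFetch {

  // Wraps the body stream if the server declared gzip content encoding.
  static InputStream decode(InputStream body, String contentEncoding) throws IOException {
    boolean gzipped = contentEncoding != null
        && contentEncoding.trim().equalsIgnoreCase("gzip");
    return gzipped ? new GZIPInputStream(body) : body;
  }

  // Requests a URL, advertising gzip support, and returns the decoded body.
  static InputStream open(String url) throws IOException {
    HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
    conn.setRequestProperty("Accept-Encoding", "gzip");
    return decode(conn.getInputStream(), conn.getContentEncoding());
  }

  // Offline demonstration: compress a small document and decode it again.
  public static void main(String[] args) throws IOException {
    byte[] xml = "<root><item>Test</item></root>".getBytes(StandardCharsets.UTF_8);
    ByteArrayOutputStream compressed = new ByteArrayOutputStream();
    try (GZIPOutputStream gz = new GZIPOutputStream(compressed)) {
      gz.write(xml);
    }
    try (InputStream in = decode(new ByteArrayInputStream(compressed.toByteArray()), "gzip")) {
      System.out.println(new String(in.readAllBytes(), StandardCharsets.UTF_8));
    }
  }
}
```

A server that ignores the Accept-Encoding header simply returns the plain
body, which decode passes through unchanged.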

Cheers,
Christian

[1] https://github.com/BaseXdb/basex/issues/1381


> I'm not sure if they request gzipped
> files. I think I've tested it once and it didn't.
> For example trying to get a 233MB XML file using gzip compression, will only
> need to fetch 27.8MB (this is a random file, the compression may vary for
> different XML files). We are working with files that can be over 1GB, so it
> can make a difference in bandwidth and execution (compilation) time.


Re: [basex-talk] TagSoup and html5 support

2016-12-21 Thread George Sofianos

> Would it also help us converting HTML5, or is it a general suggestion? ;)

Unfortunately no, it was a general suggestion :(
In our projects though, we are using https://jsoup.org/ and it works
well, also very easy to use. I still prefer XPath over the CSS selectors.

> Out of interest: Where would this come into play? When using
> http:send-request, or also at other places?

I'm talking about calls that happen using XQuery
doc("http://randomhost.rn/random.xml"). I'm not sure if they request
gzipped files. I think I tested it once and it didn't.
For example, fetching a 233 MB XML file with gzip compression only requires
transferring 27.8 MB (this is a random file; the compression ratio may vary
for different XML files). We are working with files that can be over 1 GB,
so it can make a difference in bandwidth and execution (compilation) time.


Re: [basex-talk] TagSoup and html5 support

2016-12-21 Thread Christian Grün
> Speaking about suggestions, how do you feel about adding Apache HttpClient
> to BaseX?

Would it also help us converting HTML5, or is it a general suggestion? ;)

> It can help with requesting gzipped XML files (which makes a huge
> difference in large XML files), and could possibly use the http cache
> mechanism.

Out of interest: Where would this come into play? When using
http:send-request, or also at other places?


Re: [basex-talk] Tail Recursion Error on startup

2016-12-21 Thread Christian Grün
> I'm curious, what value do you recommend here? I've been using it for BaseX
> with -Xss4m for a long time, but I'm sure that is too much.

Hm, good question ;) I think that the Java default setting (…which
also depends on your system configuration) is usually the best
tradeoff. In our own apps, we usually rewrite our XQuery code such
that there is no need for this flag (mostly because of convenience, to
ensure that it runs out-of-the-box when changing the system). If you
don’t experience any bottlenecks with 4m that you don’t encounter with
a smaller value, it’s probably a good choice.

Thanks,
Christian


Re: [basex-talk] TagSoup and html5 support

2016-12-21 Thread George Sofianos

> If we find working and light-weight alternatives, we could replace the
> original distribution of TagSoup with the new solution. Suggestions
> are welcome.

Speaking about suggestions, how do you feel about adding Apache
HttpClient to BaseX? It can help with requesting gzipped XML files
(which makes a huge difference for large XML files), and could possibly
use the HTTP cache mechanism.


Regards,
George


Re: [basex-talk] Tail Recursion Error on startup

2016-12-21 Thread George Sofianos



> I know some of you are waiting for BaseX 8.6, and I promised to make
> it happen by the end of this year. I am afraid we won’t make it in
> time, but you can definitely expect the new version in January!

Thanks again for all the work.

> The problem itself is a generic one and not limited to BaseX or
> XQuery. Did you try rewriting your function calls to tail calls [1]?
> Another alternative is to increase the stack size of Java (via the
> -Xss flag).

I'm curious, what value do you recommend here? I've been running
BaseX with -Xss4m for a long time, but I'm sure that is too much.


Re: [basex-talk] TagSoup and html5 support

2016-12-21 Thread Christian Grün
> Nope, I haven't. It should be implemented in the BaseX server; we are using
> RESTXQ functionality.

If we find working and light-weight alternatives, we could replace the
original distribution of TagSoup with the new solution. Suggestions
are welcome.


Re: [basex-talk] TagSoup and html5 support

2016-12-21 Thread Alexander Shpack
On Wed, Dec 21, 2016 at 3:33 PM, Christian Grün 
wrote:

> Hi Alex,
>
> Currently, there is no alternative I have in mind. As I assume that
> the original author of TagSoup has stopped development quite a while
> ago, it could indeed be interesting to find alternatives or extended
> versions of the original TagSoup code. Have you already tried the
> project you’ve been quoting in your e-mail?
>
>
Nope, I haven't. It should be implemented in the BaseX server; we are using
RESTXQ functionality.


Re: [basex-talk] Tail Recursion Error on startup

2016-12-21 Thread Christian Grün
> Any thoughts on why it would clear up after 3-4 attempts of running the
> query?

As I tried to indicate, it could be Java’s Just-In-Time compiler that
evaluates code differently after the initial warmup phase. That’s just
a guess, though. It also depends on the version of Java that you
are using, and the distribution (OpenJDK vs. Oracle).
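
One way to get a feeling for this is to measure how deep a recursion can go
before the stack overflows, several times within the same JVM. The sketch
below is only an illustration of that idea under the guess stated above;
StackDepthProbe and its methods are hypothetical names, it says nothing
about how BaseX evaluates queries, and the numbers it prints will differ
between machines, Java versions and runs. The second part requests threads
with different stack sizes, which is comparable in spirit to changing -Xss
(the JVM treats the requested size as a hint).

```java
// Hypothetical probe for illustration; unrelated to BaseX internals.
final class StackDepthProbe {

  private static int depth;

  // Recurses until the stack is exhausted, counting the nested calls.
  private static void recurse() {
    depth++;
    recurse();
  }

  // Returns how many nested calls succeeded before the stack overflowed.
  static int maxDepth() {
    depth = 0;
    try {
      recurse();
    } catch (StackOverflowError e) {
      // expected: the stack of the current thread is exhausted
    }
    return depth;
  }

  // Runs the probe in a new thread that requests the given stack size in bytes.
  static int maxDepthWithStack(long stackBytes) throws InterruptedException {
    int[] result = new int[1];
    Thread t = new Thread(null, () -> result[0] = maxDepth(), "probe", stackBytes);
    t.start();
    t.join();
    return result[0];
  }

  public static void main(String[] args) throws InterruptedException {
    // Repeated runs in the same JVM may report different depths.
    for (int run = 1; run <= 5; run++) {
      System.out.println("run " + run + ": depth " + maxDepth());
    }
    System.out.println("512k stack: depth " + maxDepthWithStack(512 * 1024));
    System.out.println("4m stack:   depth " + maxDepthWithStack(4L * 1024 * 1024));
  }
}
```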


> 
> From: Christian Grün 
> To: buddyonweb-software 
> Cc: BaseX Talk 
> Sent: Wednesday, December 21, 2016 8:31 AM
> Subject: Re: [basex-talk] Tail Recursion Error on startup
>
> Hi Buddy,
>
>> If we attempt to execute it again (sometimes twice, but have done so
>> repeatedly up to 5 times), the error goes away and all is fine.
>
> Difficult to say why this happens. Maybe your code is rewritten by
> Java’s JIT compiler.
>
>
>> Is there anything you can suggest or anything we can provide you to see
>> how to eliminate this? This is particularly problematic on our server
>> when we reboot our production application.
>> Anything you can suggest or any way we can help is much appreciated.
>
>
> The problem itself is a generic one and not limited to BaseX or
> XQuery. Did you try rewriting your function calls to tail calls [1]?
> Another alternative is to increase the stack size of Java (via the
> -Xss flag).
>
> Hope this helps,
> Christian
>
> [1] https://en.wikipedia.org/wiki/Tail_call
>


Re: [basex-talk] Tail Recursion Error on startup

2016-12-21 Thread buddyonweb-software
Thanks.
Any thoughts on why it would clear up after 3-4 attempts of running the query?

From: Christian Grün
To: buddyonweb-software
Cc: BaseX Talk
Sent: Wednesday, December 21, 2016 8:31 AM
Subject: Re: [basex-talk] Tail Recursion Error on startup

Hi Buddy,

> If we attempt to execute it again (sometimes twice, but have done so
> repeatedly up to 5 times), the error goes away and all is fine.

Difficult to say why this happens. Maybe your code is rewritten by
Java’s JIT compiler.

> Is there anything you can suggest or anything we can provide you to see how
> to eliminate this? This is particularly problematic on our server when we
> reboot our production application.
> Anything you can suggest or any way we can help is much appreciated.

The problem itself is a generic one and not limited to BaseX or
XQuery. Did you try rewriting your function calls to tail calls [1]?
Another alternative is to increase the stack size of Java (via the
-Xss flag).

Hope this helps,
Christian

[1] https://en.wikipedia.org/wiki/Tail_call

   

Re: [basex-talk] TagSoup and html5 support

2016-12-21 Thread Christian Grün
Hi Alex,

Currently, there is no alternative I have in mind. As I assume that
the original author of TagSoup has stopped development quite a while
ago, it could indeed be interesting to find alternatives or extended
versions of the original TagSoup code. Have you already tried the
project you’ve been quoting in your e-mail?

Cheers,
Christian



> As you know, TagSoup 1.2.1 doesn't support correct HTML5 tag nesting. For
> example, this string
> Test
> will be parsed as
> 
>
> But html5 code
> Test
> will be parsed as is.
>
> How can HTML5 support be implemented in BaseX?
> I've found this project, but I'm not sure whether it's possible to add it
> to the BaseX code:
> https://github.com/UniversityofWarwick/tagsoup-html5
>
> Thanks!


Re: [basex-talk] Tail Recursion Error on startup

2016-12-21 Thread Christian Grün
Hi Buddy,

> If we attempt to execute it again (sometimes twice, but have done so
> repeatedly up to 5 times), the error goes away and all is fine.

Difficult to say why this happens. Maybe your code is rewritten by
Java’s JIT compiler.

> Is there anything you can suggest or anything we can provide you to see how
> to eliminate this? This is particularly problematic on our server when we
> reboot our production application.
> Anything you can suggest or any way we can help is much appreciated.

The problem itself is a generic one and not limited to BaseX or
XQuery. Did you try rewriting your function calls to tail calls [1]?
Another alternative is to increase the stack size of Java (via the
-Xss flag).
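
To illustrate what rewriting to tail calls means in general, here is a small
sketch of the three shapes involved. It is written in Java only because
BaseX runs on the JVM; the names TailCallShapes, sum, sumTail and sumLoop are
made up, and the actual .xqm code in question would of course be XQuery.
Note that the JVM does not eliminate tail calls in Java code, so in Java
only the loop is safe for very large inputs; the accumulator form is the
shape that a processor with tail-call optimization can evaluate without
growing the stack.

```java
// Hypothetical example for illustration; not taken from the code in question.
final class TailCallShapes {

  // Not a tail call: the addition happens after the recursive call returns,
  // so every level has to keep its stack frame alive.
  static long sum(long n) {
    return n == 0 ? 0 : n + sum(n - 1);
  }

  // Tail call: the recursive call is the last thing the function does, and
  // the running total travels in the accumulator argument.
  static long sumTail(long n, long acc) {
    return n == 0 ? acc : sumTail(n - 1, acc + n);
  }

  // What tail-call elimination effectively turns sumTail into.
  static long sumLoop(long n) {
    long acc = 0;
    while (n != 0) {
      acc += n;
      n--;
    }
    return acc;
  }

  public static void main(String[] args) {
    System.out.println(sum(1000));
    System.out.println(sumTail(1000, 0));
    System.out.println(sumLoop(1_000_000));
  }
}
```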

Hope this helps,
Christian

[1] https://en.wikipedia.org/wiki/Tail_call


[basex-talk] Tail Recursion Error on startup

2016-12-21 Thread buddyonweb-software
When we start up BaseX (whether the server or the GUI) and run a particular
query against a particular .xqm file, we always get a tail recursion error:
Error:[bxerr:BASX0005] Stack Overflow: Try tail recursion?
If we attempt to execute it again (sometimes twice, but we have done so
repeatedly up to 5 times), the error goes away and all is fine.
Is there anything you can suggest or anything we can provide you to see how to
eliminate this? This is particularly problematic on our server when we reboot
our production application. Anything you can suggest or any way we can help is
much appreciated.

Buddy

[basex-talk] TagSoup and html5 support

2016-12-21 Thread Alexander Shpack
Hi team!

As you know, TagSoup 1.2.1 doesn't support correct HTML5 tag nesting. For
example, this string
Test
will be parsed as


But html5 code
Test
will be parsed as is.

How can HTML5 support be implemented in BaseX?
I've found this project, but I'm not sure whether it's possible to add it to
the BaseX code:
https://github.com/UniversityofWarwick/tagsoup-html5

Thanks!