Re: [basex-talk] Different results when importing HTML documents

2023-02-02 Thread Christian Grün
Hi Tim,

A link to the TagSoup documentation already exists some lines further
above in the article.

I have slightly changed the text and the example; I hope this makes it
less confusing.

Cheers,
Christian



On Thu, Feb 2, 2023 at 6:32 PM Timothée wrote:
>
> Hello Christian,
>
> Thanks for your reply. I added the documents again with the default options 
> and I do get satisfying results. Not sure why I kept on using the settings 
> recommended in the documentation...
> Would it be possible to add the tagsoup documentation link about the parser 
> options to the BaseX doc? That could be helpful.
>
> Thanks,
> - Tim
>
> On Mon, Jan 30, 2023 at 10:42 PM Christian Grün wrote:
>>
>> Hi Tim,
>>
>> I assume the article element will be preserved if you omit the
>> nobogons HTMLPARSER option [1]. Usually, there’s no need to set
>> specific options if the default behavior gives satisfying results.
>>
>> Best,
>> Christian
>>
>> [1] http://vrici.lojban.org/~cowan/tagsoup/
>>
>>
>>
>> On Fri, Jan 27, 2023 at 8:05 PM Timothée wrote:
>> >
>> > Hello all,
>> >
>> > I am trying to store HTML documents in BaseX. I set up a local instance of
>> > BaseX on my computer using Docker, and I imported this file in it: 
>> > https://pastebin.com/HJdJgLv9
>> >
>> > On my local BaseX instance, the document is imported and
>> > "/html/body/article" does return the <article> node as expected.
>> >
>> > On my remote/production BaseX instance (using the same Dockerfile and
>> > image), the document is imported but the <article> tag is "stripped" (even
>> > though its contents / child nodes remain in the imported document).
>> > "/html/body/article" is empty.
>> >
>> > If I copy over the .basex files from my local database to my remote
>> > database, then the documents are complete like on my local instance. I
>> > also tried to import the documents again on my local instance, and the
>> > <article> tag gets stripped too (and the child nodes remain).
>> >
>> > What am I doing wrong when importing my documents? What did I do to import 
>> > them properly in my current local instance? I tried a lot of options but I 
>> > just can't figure out why this happens (I fiddled a lot with it).
>> >
>> > I used the following options when importing my documents, as per the 
>> > documentation:
>> > SET PARSER html
>> > SET HTMLPARSER method=xml,nons=true,nocdata=true,nodefaults=true,nobogons=true,nocolons=true,ignorable=true
>> > SET CREATEFILTER *.html
>> >
>> > I also use SET FTINDEX true but I don't think it would have an impact 
>> > anyway.
>> >
>> > Thank you very much!
>> > - Tim
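
Christian's suggestion above amounts to importing without nobogons. As far as the TagSoup documentation describes it, that option suppresses elements the parser does not know, and an HTML5 element such as <article> appears to fall into that category. A minimal sketch of the import commands from the thread with only nobogons removed follows; the database name "articles" and the path "/data/html" are hypothetical, and dropping the SET HTMLPARSER line entirely (the defaults) is what worked for Tim in the end.

```
# Same options as in the original post, minus nobogons=true
SET PARSER html
SET HTMLPARSER method=xml,nons=true,nocdata=true,nodefaults=true,nocolons=true,ignorable=true
SET CREATEFILTER *.html
# Hypothetical database name and input directory
CREATE DB articles /data/html
# Should be greater than 0 if the <article> elements survived the import
XQUERY count(//*:article)
```

The wildcard in //*:article is used so that the check does not depend on whether the imported elements end up in a namespace.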


Re: [basex-talk] Different results when importing HTML documents

2023-02-02 Thread Timothée
Hello Christian,

Thanks for your reply. I added the documents again with the default options
and I do get satisfying results. Not sure why I kept on using the settings
recommended in the documentation...
Would it be possible to add the tagsoup documentation link about the parser
options to the BaseX doc? That could be helpful.

Thanks,
- Tim

On Mon, Jan 30, 2023 at 10:42 PM Christian Grün wrote:

> Hi Tim,
>
> I assume the article element will be preserved if you omit the
> nobogons HTMLPARSER option [1]. Usually, there’s no need to set
> specific options if the default behavior gives satisfying results.
>
> Best,
> Christian
>
> [1] http://vrici.lojban.org/~cowan/tagsoup/
>
>
>
> On Fri, Jan 27, 2023 at 8:05 PM Timothée wrote:
> >
> > Hello all,
> >
> > I am trying to store HTML documents in BaseX. I set up a local instance
> > of BaseX on my computer using Docker, and I imported this file in it:
> > https://pastebin.com/HJdJgLv9
> >
> > On my local BaseX instance, the document is imported and
> > "/html/body/article" does return the <article> node as expected.
> >
> > On my remote/production BaseX instance (using the same Dockerfile and
> > image), the document is imported but the <article> tag is "stripped" (even
> > though its contents / child nodes remain in the imported document).
> > "/html/body/article" is empty.
> >
> > If I copy over the .basex files from my local database to my remote
> > database, then the documents are complete like on my local instance. I also
> > tried to import the documents again on my local instance, and the <article>
> > tag gets stripped too (and the child nodes remain).
> >
> > What am I doing wrong when importing my documents? What did I do to
> > import them properly in my current local instance? I tried a lot of options
> > but I just can't figure out why this happens (I fiddled a lot with it).
> >
> > I used the following options when importing my documents, as per the
> > documentation:
> > SET PARSER html
> > SET HTMLPARSER method=xml,nons=true,nocdata=true,nodefaults=true,nobogons=true,nocolons=true,ignorable=true
> > SET CREATEFILTER *.html
> >
> > I also use SET FTINDEX true but I don't think it would have an impact
> > anyway.
> >
> > Thank you very much!
> > - Tim
>


Re: [basex-talk] Reducing logs

2023-02-02 Thread Eliot Kimber
For me, being able to log HTTP requests separately (different log file) from 
XQuery-generated would be very helpful. For our Mirabel server we’ll want to be 
able to mine usage statistics from the HTTP request logs but for general 
administration and monitoring, it’s the messages generated by the XQuery code 
serving the pages that are useful.

Cheers,

E.

Eliot Kimber
Sr Staff Content Engineer
O: 512 554 9368
M: 512 554 9368
servicenow.com

From: BaseX-Talk on behalf of Christian Grün
Date: Thursday, February 2, 2023 at 7:42 AM
To: Marco Lettere
Cc: basex-talk@mailman.uni-konstanz.de
Subject: Re: [basex-talk] Reducing logs

Hi Marco,

As this requirement has been reported back to us repeatedly, it’s time
to create an issue for it [1].

I think we’ll replace the general LOG option with a more flexible
variant. We’ll still need to define which levels to offer; any
feedback is welcome. The new option will probably be made available
with BaseX 11 (to be expected around spring).

Ciao,
Christian

[1] 
https://github.com/BaseXdb/basex/issues/2168




On Thu, Feb 2, 2023 at 1:37 PM Marco Lettere wrote:
>
> Dear all,
>
> when using the BaseX XQuery server (9.7.3) we see that logs such as [1]
> are produced for every query execution.
>
> Since we have a scenario which involves polling and thus we have a lot
> of query executions, the log files grow to hundreds of MBs during a day.
>
> This makes them unmanageable and even causes the DBA UI to crash with OOM
> exceptions when opening the logs page.
>
> Is there a way to disable this logging (without affecting other logs
> such as HTTP)?
>
> Is it really necessary to log server-side query executions at this
> granularity by default? Maybe it could be made optional?
>
> Thanks for any support.
>
> Regards,
>
> Marco.
>
> [1]
> 12:22:27.849 10.0.4.15:50424 admin OK 0.02 CLOSE[0]
> 12:22:27.848 10.0.4.15:50434 admin OK 1.40 FULL[0]
> 12:22:27.848 10.0.4.15:50424 admin OK 0.83 FULL[0]
> 12:22:27.847 10.0.4.15:50420 admin OK 0.05 CLOSE[0]
> 12:22:27.847 10.0.4.15:50412 admin OK 0.04 CLOSE[0]
> 12:22:27.847 10.0.4.15:50434 admin OK 0.05 BIND[0]
> db=infrastructures as xs:string
> 12:22:27.846 10.0.4.15:50410 admin OK 0.22 CLOSE[0]
> 12:22:27.846 10.0.4.15:50434 admin OK 0.06 BIND[0]
> id=ontheroad-lxd as xs:string
> 12:22:27.846 10.0.4.15:50424 admin OK 0.05 BIND[0]
> db=infrastructures as xs:string
> 12:22:27.846 10.0.4.15:50420 admin OK 0.62 FULL[0]
> 12:22:27.846 10.0.4.15:50412 admin OK 0.90 FULL[0]
> 12:22:27.846 10.0.4.15:50424 admin OK 0.06 BIND[0]
> id=ontheroad-lxd as xs:string
> 12:22:27.845 10.0.4.15:50380 admin OK 0.02 CLOSE[0]
> 12:22:27.845 10.0.4.15:50366 admin OK 0.01 CLOSE[0]
> 12:22:27.845 10.0.4.15:50402 admin OK 0.02 CLOSE[0]
> 12:22:27.845 10.0.4.15:50420 admin OK 0.04 BIND[0]
> db=infrastructures as xs:string
> 12:22:27.845 10.0.4.15:50386 admin OK 0.02 CLOSE[0]
> 12:22:27.845 10.0.4.15:50410 admin OK 0.88 FULL[0]
> 12:22:27.845 10.0.4.15:50420 admin OK 0.06 BIND[0]
> id=ontheroad-lxd as xs:string
> 12:22:27.845 10.0.4.15:50412 admin OK 0.04 BIND[0]
> db=infrastructures as xs:string
> 12:22:27.845 10.0.4.15:50434 admin OK 0.04 QUERY[0] declare
> variable $id external; declare variable $db external;
> db:open($db)[json/id = $id]
>


Re: [basex-talk] Reducing logs

2023-02-02 Thread Christian Grün
Hi Marco,

As this requirement has been reported back to us repeatedly, it’s time
to create an issue for it [1].

I think we’ll replace the general LOG option with a more flexible
variant. We’ll still need to define which levels to offer; any
feedback is welcome. The new option will probably be made available
with BaseX 11 (to be expected around spring).

Ciao,
Christian

[1] https://github.com/BaseXdb/basex/issues/2168




On Thu, Feb 2, 2023 at 1:37 PM Marco Lettere wrote:
>
> Dear all,
>
> when using the BaseX XQuery server (9.7.3) we see that logs such as [1]
> are produced for every query execution.
>
> Since we have a scenario which involves polling and thus we have a lot
> of query executions, the log files grow to hundreds of MBs during a day.
>
> This makes them unmanageable and even causes the DBA UI to crash with OOM
> exceptions when opening the logs page.
>
> Is there a way to disable this logging (without affecting other logs
> such as HTTP)?
>
> Is it really necessary to log server-side query executions at this
> granularity by default? Maybe it could be made optional?
>
> Thanks for any support.
>
> Regards,
>
> Marco.
>
> [1]
> 12:22:27.849 10.0.4.15:50424 admin OK 0.02 CLOSE[0]
> 12:22:27.848 10.0.4.15:50434 admin OK 1.40 FULL[0]
> 12:22:27.848 10.0.4.15:50424 admin OK 0.83 FULL[0]
> 12:22:27.847 10.0.4.15:50420 admin OK 0.05 CLOSE[0]
> 12:22:27.847 10.0.4.15:50412 admin OK 0.04 CLOSE[0]
> 12:22:27.847 10.0.4.15:50434 admin OK 0.05 BIND[0]
> db=infrastructures as xs:string
> 12:22:27.846 10.0.4.15:50410 admin OK 0.22 CLOSE[0]
> 12:22:27.846 10.0.4.15:50434 admin OK 0.06 BIND[0]
> id=ontheroad-lxd as xs:string
> 12:22:27.846 10.0.4.15:50424 admin OK 0.05 BIND[0]
> db=infrastructures as xs:string
> 12:22:27.846 10.0.4.15:50420 admin OK 0.62 FULL[0]
> 12:22:27.846 10.0.4.15:50412 admin OK 0.90 FULL[0]
> 12:22:27.846 10.0.4.15:50424 admin OK 0.06 BIND[0]
> id=ontheroad-lxd as xs:string
> 12:22:27.845 10.0.4.15:50380 admin OK 0.02 CLOSE[0]
> 12:22:27.845 10.0.4.15:50366 admin OK 0.01 CLOSE[0]
> 12:22:27.845 10.0.4.15:50402 admin OK 0.02 CLOSE[0]
> 12:22:27.845 10.0.4.15:50420 admin OK 0.04 BIND[0]
> db=infrastructures as xs:string
> 12:22:27.845 10.0.4.15:50386 admin OK 0.02 CLOSE[0]
> 12:22:27.845 10.0.4.15:50410 admin OK 0.88 FULL[0]
> 12:22:27.845 10.0.4.15:50420 admin OK 0.06 BIND[0]
> id=ontheroad-lxd as xs:string
> 12:22:27.845 10.0.4.15:50412 admin OK 0.04 BIND[0]
> db=infrastructures as xs:string
> 12:22:27.845 10.0.4.15:50434 admin OK 0.04 QUERY[0] declare
> variable $id external; declare variable $db external;
> db:open($db)[json/id = $id]
>


[basex-talk] Reducing logs

2023-02-02 Thread Marco Lettere

Dear all,

when using the BaseX XQuery server (9.7.3) we see that logs such as [1]
are produced for every query execution.

Since we have a scenario which involves polling and thus we have a lot
of query executions, the log files grow to hundreds of MBs during a day.

This makes them unmanageable and even causes the DBA UI to crash with OOM
exceptions when opening the logs page.

Is there a way to disable this logging (without affecting other logs
such as HTTP)?

Is it really necessary to log server-side query executions at this
granularity by default? Maybe it could be made optional?


Thanks for any support.

Regards,

Marco.

[1]
12:22:27.849    10.0.4.15:50424    admin    OK    0.02    CLOSE[0]
12:22:27.848    10.0.4.15:50434    admin    OK    1.40    FULL[0]
12:22:27.848    10.0.4.15:50424    admin    OK    0.83    FULL[0]
12:22:27.847    10.0.4.15:50420    admin    OK    0.05    CLOSE[0]
12:22:27.847    10.0.4.15:50412    admin    OK    0.04    CLOSE[0]
12:22:27.847    10.0.4.15:50434    admin    OK    0.05    BIND[0] db=infrastructures as xs:string
12:22:27.846    10.0.4.15:50410    admin    OK    0.22    CLOSE[0]
12:22:27.846    10.0.4.15:50434    admin    OK    0.06    BIND[0] id=ontheroad-lxd as xs:string
12:22:27.846    10.0.4.15:50424    admin    OK    0.05    BIND[0] db=infrastructures as xs:string
12:22:27.846    10.0.4.15:50420    admin    OK    0.62    FULL[0]
12:22:27.846    10.0.4.15:50412    admin    OK    0.90    FULL[0]
12:22:27.846    10.0.4.15:50424    admin    OK    0.06    BIND[0] id=ontheroad-lxd as xs:string
12:22:27.845    10.0.4.15:50380    admin    OK    0.02    CLOSE[0]
12:22:27.845    10.0.4.15:50366    admin    OK    0.01    CLOSE[0]
12:22:27.845    10.0.4.15:50402    admin    OK    0.02    CLOSE[0]
12:22:27.845    10.0.4.15:50420    admin    OK    0.04    BIND[0] db=infrastructures as xs:string
12:22:27.845    10.0.4.15:50386    admin    OK    0.02    CLOSE[0]
12:22:27.845    10.0.4.15:50410    admin    OK    0.88    FULL[0]
12:22:27.845    10.0.4.15:50420    admin    OK    0.06    BIND[0] id=ontheroad-lxd as xs:string
12:22:27.845    10.0.4.15:50412    admin    OK    0.04    BIND[0] db=infrastructures as xs:string
12:22:27.845    10.0.4.15:50434    admin    OK    0.04    QUERY[0] declare variable $id external; declare variable $db external; db:open($db)[json/id = $id]
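
Until a finer-grained LOG option exists, one way to see what is filling the logs is to summarize a day's entries by operation. The following is only a sketch: it relies on the admin:logs function of the BaseX Admin Module and assumes that the text of each returned entry starts with one of the operation labels seen in [1] (CLOSE[0], FULL[0], BIND[0], QUERY[0]); the date is simply the day of this thread. It still reads the whole log for that day, so it may be slow on very large files.

```xquery
(: Count log entries per operation for one day.
   Assumption: the entry text starts with a label such as "BIND[0]". :)
for $entry in admin:logs('2023-02-02')
let $text := normalize-space($entry)
(: Keep only entries that look like client/server operations :)
where matches($text, '^[A-Z]+\[\d+\]')
let $op := replace($text, '^([A-Z]+)\[.*$', '$1')
group by $op
order by count($entry) descending
return $op || ': ' || count($entry)
```

Entries that do not match the label pattern, HTTP requests for example, are skipped here and could be counted in a second query of the same shape.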