Re: error on tdb2.tdbbackup: "Caused by: org.apache.thrift.protocol.TProtocolException: Unrecognized type 0"

2023-07-13 Thread Jeffrey C. Witt
Hi Andy,

Thanks for your response.

Based on this comment...

It's not looking good for the database if /$/backup is failing. That's a
very simple use of the database.

Do you think the database is corrupted in such a way that it would be
better to just do a complete rebuild?

Many thanks,
jw

On Thu, Jul 13, 2023 at 3:55 PM Andy Seaborne  wrote:

> Hi Jeff,
>
> There were fixes to compaction in 4.6.0.
>
> On 12/07/2023 23:53, Jeffrey C. Witt wrote:
> > Dear List,
> >
> > I ran into an unusual error today when I tried to back up (and also
> > compact) my TDB instance.
> >
> > I first encountered the error when trying to compact and back up using
> > Fuseki 4.3.2.
> >
> > I ran both:
>
> If you ran them at the same time, you may have triggered the problem that
> was fixed in 4.6.0.
>
> >
> > $ curl -XPOST http://localhost:3030/$/backup/ds
> > $ curl -XPOST http://localhost:3030/$/compact/ds
> >
> > Both of these commands executed for a while, filling up disk space, and
> > then suddenly stopped.
> >
> > Eventually, I ran:
> >
> > $ curl -XGET http://localhost:3030/$/status
> >
> > and for both the compact and backup commands, I received:
> >
> >   "success": false (as seen in the example below)
> >
> > [ {
> >  "finished" : "2014-05-28T13:54:13.608+01:00" ,
> >  "started" : "2014-05-28T13:54:03.607+01:00" ,
>
> 2014?
>
> >  "task" : "backup" ,
> >  "taskId" : "1" ,
> >  "success" : false
> >}
> > ]
> >
> >
> > As I couldn't find any other message to help me diagnose the issue, I
> > stopped the running Fuseki instance and tried to use the tdb2.tdbbackup
> > command.
> >
> > For this I used apache-jena-4.9.0 and ran the following command:
> >
> > $ tdb2.tdbbackup --loc build
> >
> > This command ran for a while, and I could see that it was writing to the
> > disk, but then it suddenly failed and gave me the following error message.
> >
> ...
> > *Caused by: org.apache.thrift.protocol.TProtocolException: Unrecognized
> > type 0*
> ...
> >
> >
> > (I am assuming that this error is the same reason the "compact" command
> > wasn't working.)
>
> The problem would have happened on the failed compact; it just manifests
> itself later on read.
>
> (there is another way to cause the same problem - if some other process
> touches database files)
>
> > I'm not really sure what's gone wrong. I've done the Fuseki compact
> > command several times without a problem.
> >
> > Likewise, the Fuseki HTTP server continues to run well. It is
> > responding to all SPARQL GET requests as usual.
> >
> > But the database is growing (currently at 70G), and I need to be able to
> > both back it up and compact it as it grows.
> >
> > I would be most grateful for assistance or help diagnosing the issue.
> > Please let me know if I can provide more information.
>
> It's not looking good for the database if /$/backup is failing. That's a
> very simple use of the database.
>
> You may be able to extract data using SPARQL.
>
> Some data will be in the backup file (the tail of the file may be
> mangled but it's compressed n-quads so easy to text edit).
>
>  Andy
>
> >
> > Sincerely,
> >
> > Jeff
> >
>


-- 
Dr. Jeffrey C. Witt
Philosophy Department
Loyola University Maryland
4501 N. Charles St.
Baltimore, MD 21210
www.jeffreycwitt.com


Re: Dataset management API

2023-07-13 Thread Andy Seaborne




On 13/07/2023 21:09, Martynas Jusevičius wrote:

Andy,

Where are the dataset definitions created through the API persisted?


run/configuration


Are they merged with the datasets defined in the config file, or how
does it work?


--config and run/configuration contribute services. Avoid name clashes.
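
For example (a sketch - the dataset name "mydb" is illustrative, and
the exact file name Fuseki writes may differ):

$ curl -s -XPOST -d 'dbName=mydb&dbType=tdb2' http://localhost:3030/$/datasets
$ ls run/configuration/
mydb.ttl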

Andy



Martynas

On Sun, 2 Jul 2023 at 19.03, Andy Seaborne  wrote:




On 02/07/2023 13:23, Martynas Jusevičius wrote:

Hi,

Can I see an example of the data that needs to be POSTed to /$/datasets

in

order to create a new dataset+service?

The API is documented here but the data example is missing:


https://jena.apache.org/documentation/fuseki2/fuseki-server-protocol.html#adding-a-dataset-and-its-services


I hope it’s the same data that is used in the config file?


the service part - or the parameters dbType and dbName

ActionDatasets.java
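
A sketch of the two forms the POST can take (names illustrative; the
Turtle file is a service description of the same shape as used in a
configuration file):

$ curl -XPOST -d 'dbName=mydb&dbType=tdb2' http://localhost:3030/$/datasets

$ curl -XPOST -H 'Content-Type: text/turtle' --data-binary @mydb.ttl \
    http://localhost:3030/$/datasets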




https://jena.apache.org/documentation/fuseki2/fuseki-configuration.html#defining-the-service-name-and-endpoints-available


Are there any practical limits on the number of datasets/services?


No.

Each database takes up some memory which is more than the management
information of a configuration.

  Andy



Thanks.

Martynas







Re: Dataset management API

2023-07-13 Thread Martynas Jusevičius
I realised the dataset management API is only available in
fuseki-webapp and not fuseki-main. That's unfortunate.

On Thu, Jul 13, 2023 at 10:09 PM Martynas Jusevičius
 wrote:
>
> Andy,
>
> Where are the dataset definitions created through the API persisted?
> Are they merged with the datasets defined in the config file, or how does it 
> work?
>
> Martynas
>
> On Sun, 2 Jul 2023 at 19.03, Andy Seaborne  wrote:
>>
>>
>>
>> On 02/07/2023 13:23, Martynas Jusevičius wrote:
>> > Hi,
>> >
>> > Can I see an example of the data that needs to be POSTed to /$/datasets in
>> > order to create a new dataset+service?
>> >
>> > The API is documented here but the data example is missing:
>> > https://jena.apache.org/documentation/fuseki2/fuseki-server-protocol.html#adding-a-dataset-and-its-services
>> >
>> > I hope it’s the same data that is used in the config file?
>>
>> the service part - or the parameters dbType and dbName
>>
>> ActionDatasets.java
>>
>> > https://jena.apache.org/documentation/fuseki2/fuseki-configuration.html#defining-the-service-name-and-endpoints-available
>> >
>> > Are there any practical limits on the number of datasets/services?
>>
>> No.
>>
>> Each database takes up some memory which is more than the management
>> information of a configuration.
>>
>>  Andy
>>
>> >
>> > Thanks.
>> >
>> > Martynas
>> >


Re: OOM Killed

2023-07-13 Thread Andy Seaborne
> size_t MaxHeapSize               = 4175429632   {product} {ergonomic}
> size_t ShenandoahSoftMaxHeapSize = 0            {manageable} {default}


Have you tried different garbage collectors?

Shenandoah does its work in parallel, so some of that heap may be in use
by the collector rather than by the queries.
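
One way to experiment (a sketch - the fuseki-server script in the binary
distribution passes JVM_ARGS through to the JVM; G1 is shown as one
alternative to try):

$ JVM_ARGS="-Xmx4G -XX:+UseG1GC -Xlog:gc" ./fuseki-server --loc=database --port=7000 --localhost /query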


Andy

On 11/07/2023 10:39, Laura Morales wrote:

I have tried to do some testing but I cannot get a definitive answer about what
works and what doesn't, because there are so many variables. Also it doesn't
fail right away, but after 1h to 1.5h, so I've done fewer than a dozen tests
actually. I'm sorry that I can't be more precise.

The two PCs that I've got are an i3 4th-gen 2C4T with 8GB RAM, and an i7 
4th-gen 2C4T with 16GB RAM. The database is stored in a 1TB USB3.0 SSD (which I 
move between the two PCs). Either way, the only component that seems to make a 
difference is the amount of RAM. They are physical machines (bare metal) and 
not containers.
I'm running the "fuseki-server" binary, downloaded from the binary distribution, as
"./fuseki-server --loc=database --port=7000 --localhost /query".
I query Fuseki (4.8) over HTTP from a script that runs on the same PC. The script uses <100MB of RAM. All queries are 
"read" (no writes) and actually very basic. Either "DESCRIBE " or "SELECT" that selects a 
node and follows a couple of links 3 or 4 levels deep at most. Fuseki answers them very quickly, <5ms, but occasionally it takes 
50-100ms or rarely a couple of seconds (probably because of garbage collection?). The queries are always the same, run over and 
over across the dataset. The dataset contains approximately 200K nodes, 2.5M triples, 4GB disk size. Most nodes have, among their 
properties, 3 that contain long strings, approximately 20KB-50KB combined, per node. I don't query those properties directly, but 
when I "DESCRIBE" a node they are retrieved. Fuseki is very fast, but these strings may contribute to the high memory 
load (I'm only guessing). I cannot identify any particular query as the offender, since it's the same bunch of queries run over and 
over at max speed (i.e. one after the other with no wait time). It's pretty much the same amount of work for every query, and it
works perfectly fine until it consumes all the RAM and SWAP.

The only configuration that worked for me (i.e. it completed the job) is -Xmx4G
and no parallelism at all (one request after the other in series) on 16GB of 
RAM (it used up all the RAM available). It seems strange to me that it needs so 
much RAM even when all the requests are serialized. Querying in parallel with 
16 or 32 threads doesn't seem to make much of a difference to Fuseki, other 
than the even higher memory consumption (Fuseki answers all queries very 
quickly in milliseconds, until it runs out of memory).
The memory growth is not instantaneous, and is not linear. I can see RAM usage 
fluctuate by a 3-4GB range (for example between 3GB and 7GB). But the trend is 
to use more and more memory. For example before crashing it would fluctuate 
between 10GB and 15GB.
If I increase -Xmx over 4GB, Fuseki is eventually OOM killed by the kernel. 
Below 4GB, Fuseki crashes with a heap error like this (both cases fail well 
after 1h of work):

10:03:21 WARN  QueuedThreadPool :: Job failed
java.lang.OutOfMemoryError: Java heap space
10:03:21 WARN  Fuseki  :: [152378] RC = 500 : Java heap space: failed 
reallocation of scalar replaced objects
java.lang.OutOfMemoryError: Java heap space: failed reallocation of scalar 
replaced objects
10:03:21 INFO  Fuseki  :: [152378] 500 Server Error (48.115 s)
10:03:23 WARN  AbstractConnector :: Accept Failure
java.lang.OutOfMemoryError: Java heap space
10:04:08 WARN  QueuedThreadPool :: Job failed
java.lang.OutOfMemoryError: Java heap space
10:04:08 WARN  QueuedThreadPool :: Job failed
java.lang.OutOfMemoryError: Java heap space
Exception in thread "HttpClient-2-SelectorManager" java.lang.OutOfMemoryError: 
Java heap space

On 8GB RAM it always fails for me, -Xmx4G or more is OOM-killed, whereas less 
ends up with a heap error.
The output of "java -XX:+PrintFlagsFinal -version | grep -i 'M..HeapSize'" is

size_t MaxHeapSize               = 4175429632   {product} {ergonomic}
size_t ShenandoahSoftMaxHeapSize = 0            {manageable} {default}
openjdk version "11.0.18" 2023-01-17
OpenJDK Runtime Environment (build 11.0.18+10-post-Debian-1deb11u1)
OpenJDK 64-Bit Server VM (build 11.0.18+10-post-Debian-1deb11u1, mixed mode, 
sharing)

I've also tried with OpenJDK 17, same results.
I tried Fuseki 3.17 too but I was getting other JSON-LD errors (probably 
related to an old JSON-LD library) so I didn't test further.

I know that I don't have the latest and greatest hardware, but I think my 
database is very small and I feel like Fuseki should not be using 16GB RAM when 

Re: Dataset management API

2023-07-13 Thread Martynas Jusevičius
Andy,

Where are the dataset definitions created through the API persisted?
Are they merged with the datasets defined in the config file, or how
does it work?

Martynas

On Sun, 2 Jul 2023 at 19.03, Andy Seaborne  wrote:

>
>
> On 02/07/2023 13:23, Martynas Jusevičius wrote:
> > Hi,
> >
> > Can I see an example of the data that needs to be POSTed to /$/datasets
> in
> > order to create a new dataset+service?
> >
> > The API is documented here but the data example is missing:
> >
> https://jena.apache.org/documentation/fuseki2/fuseki-server-protocol.html#adding-a-dataset-and-its-services
> >
> > I hope it’s the same data that is used in the config file?
>
> the service part - or the parameters dbType and dbName
>
> ActionDatasets.java
>
> >
> https://jena.apache.org/documentation/fuseki2/fuseki-configuration.html#defining-the-service-name-and-endpoints-available
> >
> > Are there any practical limits on the number of datasets/services?
>
> No.
>
> Each database takes up some memory which is more than the management
> information of a configuration.
>
>  Andy
>
> >
> > Thanks.
> >
> > Martynas
> >
>


Re: error on tdb2.tdbbackup: "Caused by: org.apache.thrift.protocol.TProtocolException: Unrecognized type 0"

2023-07-13 Thread Andy Seaborne

Hi Jeff,

There were fixes to compaction in 4.6.0.

On 12/07/2023 23:53, Jeffrey C. Witt wrote:

Dear List,

I ran into an unusual error today when I tried to back up (and also compact)
my TDB instance.

I first encountered the error when trying to compact and back up using
Fuseki 4.3.2.

I ran both:


If you ran them at the same time, you may have triggered the problem that
was fixed in 4.6.0.
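
A sketch of running them strictly in sequence instead, waiting for the
first task to finish (the task id comes back in the JSON response to the
POST; "1" here is illustrative):

$ curl -XPOST http://localhost:3030/$/backup/ds
$ curl http://localhost:3030/$/tasks/1    # poll until "finished" appears
$ curl -XPOST http://localhost:3030/$/compact/ds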




$ curl -XPOST http://localhost:3030/$/backup/ds
$ curl -XPOST http://localhost:3030/$/compact/ds

Both of these commands executed for a while, filling up disk space, and then
suddenly stopped.

Eventually, I ran:

$ curl -XGET http://localhost:3030/$/status

and for both the compact and backup commands, I received:

  "success": false (as seen in the example below)

[ {
 "finished" : "2014-05-28T13:54:13.608+01:00" ,
 "started" : "2014-05-28T13:54:03.607+01:00" ,


2014?


 "task" : "backup" ,
 "taskId" : "1" ,
 "success" : false
   }
]


As I couldn't find any other message to help me diagnose the issue, I
stopped the running Fuseki instance and tried to use the tdb2.tdbbackup
command.

For this I used apache-jena-4.9.0 and ran the following command:

$ tdb2.tdbbackup --loc build

This command ran for a while, and I could see that it was writing to the
disk, but then it suddenly failed and gave me the following error message.


...

*Caused by: org.apache.thrift.protocol.TProtocolException: Unrecognized
type 0*

...



(I am assuming that this error is the same reason the "compact" command
wasn't working.)


The problem would have happened on the failed compact; it just manifests
itself later on read.


(there is another way to cause the same problem - if some other process 
touches database files)



I'm not really sure what's gone wrong. I've done the Fuseki compact command
several times without a problem.

Likewise, the Fuseki HTTP server continues to run well. It is
responding to all SPARQL GET requests as usual.

But the database is growing (currently at 70G), and I need to be able to
both back it up and compact it as it grows.

I would be most grateful for assistance or help diagnosing the issue.
Please let me know if I can provide more information.


It's not looking good for the database if /$/backup is failing. That's a
very simple use of the database.


You may be able to extract data using SPARQL.
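
For example, a CONSTRUCT over the default graph streamed to a file (a
sketch - "/ds" and the output name are illustrative, and named graphs
would need a GRAPH pattern):

$ curl -G http://localhost:3030/ds \
    --data-urlencode 'query=CONSTRUCT { ?s ?p ?o } WHERE { ?s ?p ?o }' \
    -H 'Accept: application/n-triples' > salvage.nt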

Some data will be in the backup file (the tail of the file may be
mangled but it's compressed n-quads so easy to text edit).
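
A sketch of salvaging it (the backup file name is illustrative):

$ gunzip -c ds_2023-07-12.nq.gz > salvage.nq   # may warn about a truncated tail
$ tail salvage.nq   # inspect, then delete any mangled final lines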


Andy



Sincerely,

Jeff