On 02/05/12 07:28, Ziya Akar wrote:
Hi,

I am trying to retrieve all DBpedia triples, to extract information
about the dataset, with the query below:

Select * where  {?s ?p ?o} LIMIT 1000 OFFSET x.

I increment the value of x by 1000 on each execution. After a while the
value of x exceeds decimal bounds.

What is the error message exactly? (from which system?)

The error message is: com.hp.hpl.jena.query.QueryParseException:
Encountered "<DECIMAL>. It means that Long is not enough to retrieve
all DBpedia results.

That's not the complete error message.  The full message prints out the offending token.

Please - a complete, minimal example

Complete - all the details (what is the query being parsed?)
Minimal - the least information needed (smallest offset)

It is also good practice to include version information.

A DECIMAL is the token for a number with a DOT in it.


When I tried:

SELECT * { ?s ?p ?o } LIMIT 1000 OFFSET 100000000000


When I tried:

SELECT * { ?s ?p ?o } LIMIT 1000 OFFSET 1000000000000000000000000000

using arq.qparse, I got:

08:49:07 WARN  ParserSPARQL11       :: Unexpected throwable:
java.lang.NumberFormatException: For input string: "1000000000000000000000000000"

but if I try:

qparse 'SELECT * { ?s ?p ?o } LIMIT 1000 OFFSET 10.5'

I get

Encountered " <DECIMAL> "10.5 "" at line 1, column 41.
Was expecting:
    <INTEGER> ...

May I suggest that you have put in a number with a "." in it, maybe one over 1000 that is formatted using the convention that "." is the thousands separator.

Numbers are not written with separators in SPARQL.
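
To illustrate how a "." can end up in an offset, here is a minimal Java sketch. It assumes the client builds the query string itself and formats the offset with a locale-aware formatter. The class and method names are hypothetical and nothing here is taken from the original poster's code. It only shows that the German locale renders 1000 as "1.000", which a SPARQL parser reads as a DECIMAL, while Long.toString always produces plain digits.

```java
import java.text.NumberFormat;
import java.util.Locale;

// Hypothetical helper, for illustration only.
final class OffsetFormatting {
    // Locale-aware formatting: in the German locale "." is the grouping
    // (thousands) separator, so 1000 becomes "1.000".
    static String groupedGerman(long offset) {
        return NumberFormat.getIntegerInstance(Locale.GERMANY).format(offset);
    }

    // Locale-independent formatting: plain decimal digits, no separators,
    // which is what SPARQL expects after OFFSET.
    static String buildQuery(long offset) {
        return "SELECT * { ?s ?p ?o } LIMIT 1000 OFFSET " + Long.toString(offset);
    }

    public static void main(String[] args) {
        System.out.println(groupedGerman(1000L)); // 1.000
        System.out.println(buildQuery(1000L));
    }
}
```

If the query text is assembled with String.format or a NumberFormat anywhere, it is worth printing the exact query string just before it is sent, to check it for separators.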



A long goes up to 2^63 - 1, which is about 9 x 10^18, i.e. of the order of one million million million (an English trillion, or more usually one billion billion, or "exa"). The LOD cloud is not an exatriple of RDF. I wouldn't like to guarantee that ARQ copes that well above an int, as it's not common, but even a Java int goes up to about 2 billion, and 2 billion triples at 1K triples per call is 2 million calls to the LOD cloud copy.
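
The arithmetic can be checked with a few lines of Java. This is only a sketch; the class name and the callsNeeded helper are made up for the example, and the page size of 1000 is the LIMIT used in the query in this thread.

```java
// Hypothetical helper, for illustration only.
final class OffsetBounds {
    // Number of calls needed to page through `triples` rows at
    // `pageSize` rows per call, rounded up.
    static long callsNeeded(long triples, long pageSize) {
        return (triples + pageSize - 1) / pageSize;
    }

    public static void main(String[] args) {
        System.out.println(Long.MAX_VALUE);     // 9223372036854775807
        System.out.println(Integer.MAX_VALUE);  // 2147483647
        // Paging up to the int limit at 1000 triples per call.
        System.out.println(callsNeeded(Integer.MAX_VALUE, 1000L)); // 2147484
    }
}
```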


http://en.wikipedia.org/wiki/Metric_prefix


How can I handle this situation?  I want to continue to query.

Thanks.

Ziya

If this error comes from DBpedia, then you'll have to ask them.

Jena ARQ uses a long for the offset and limit - I suppose a BigInteger
might be necessary nowadays -- long ago, long was quite enough!
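
As a rough illustration of the long versus BigInteger point, the sketch below uses only the Java standard library. It is not ARQ's parser code, and the class and method names are hypothetical. Long.parseLong throws a NumberFormatException for the 28-digit offset tried earlier in this thread, which is the same exception type as in the qparse warning, while BigInteger accepts it.

```java
import java.math.BigInteger;

// Hypothetical helper, for illustration only; not how ARQ is implemented.
final class OffsetParsing {
    // True if the digit string parses as a Java long.
    static boolean fitsInLong(String digits) {
        try {
            Long.parseLong(digits);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    // Same check via BigInteger, which has no fixed upper bound:
    // a non-negative value fits in a long if it needs at most 63 bits.
    static boolean fitsInLongViaBigInteger(String digits) {
        return new BigInteger(digits).bitLength() <= 63;
    }

    public static void main(String[] args) {
        String big = "1000000000000000000000000000";
        System.out.println(fitsInLong("100000000000"));   // true
        System.out.println(fitsInLong(big));              // false
        System.out.println(fitsInLongViaBigInteger(big)); // false
    }
}
```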

By the way - why not download the dumps of the database instead?  Much
more efficient.

I am analyzing all datasets on the LOD cloud to extract information about
the datasets. DBpedia is only one of them. If I download the dump files, I
have to query them to extract information too. But I can delete the
analyzed triples from the dump files, and then it works. But I prefer
querying first.

It will take you a very long time to pull dbpedia over 1000 triples at a time.


        Andy

Ziya
