[ 
https://issues.apache.org/jira/browse/JENA-230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13245446#comment-13245446
 ] 

Andy Seaborne commented on JENA-230:
------------------------------------

Bill,

Thanks for the detail.  Fix put into TDB.

The fix is to use a ThreadLocal variable to hold the transaction support part 
of the state of a dataset.
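
As a rough illustration of the idea (a minimal sketch only, not the actual TDB code): the class and method names below (SharedDatasetSketch, TxnState, begin, commit, demo) are hypothetical and invented for this example. One dataset object is shared by all request threads, but the "current transaction" slot lives in a ThreadLocal, so a reader thread beginning and ending its own transaction cannot disturb a writer's transaction on another thread.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a dataset shared across request threads.
// This is NOT TDB's real class; it only shows the ThreadLocal pattern.
class SharedDatasetSketch {
    // Hypothetical per-thread transaction state.
    static final class TxnState {
        final int id;
        TxnState(int id) { this.id = id; }
    }

    private final AtomicInteger nextId = new AtomicInteger(1);

    // Each thread sees its own slot, even though the dataset is shared.
    private final ThreadLocal<TxnState> current = new ThreadLocal<>();

    // Start a transaction for the calling thread and return its id.
    int begin() {
        if (current.get() != null) {
            throw new IllegalStateException("Transaction already active on this thread");
        }
        TxnState txn = new TxnState(nextId.getAndIncrement());
        current.set(txn);
        return txn.id;
    }

    boolean isInTransaction() {
        return current.get() != null;
    }

    // End the calling thread's transaction and return its id.
    int commit() {
        TxnState txn = current.get();
        if (txn == null) {
            throw new IllegalStateException("Transaction not active");
        }
        current.remove();
        return txn.id;
    }

    // A writer holds a transaction open while a reader thread runs a
    // complete transaction of its own on the same dataset object.
    static boolean demo() {
        SharedDatasetSketch ds = new SharedDatasetSketch();
        int writerId = ds.begin();

        boolean[] readerSawWriterTxn = new boolean[1];
        Thread reader = new Thread(() -> {
            readerSawWriterTxn[0] = ds.isInTransaction();
            ds.begin();
            ds.commit();
        });
        reader.start();
        try {
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }

        // The writer's transaction must be untouched by the reader.
        boolean stillActive = ds.isInTransaction();
        int committedId = ds.commit();
        return !readerSawWriterTxn[0] && stillActive && committedId == writerId;
    }
}
```

If the transaction slot were a plain field on the shared object instead, the reader's begin/commit would overwrite or clear the writer's state, which is consistent with the "Transaction not active" warning in the report.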

This will work for Fuseki -- it's effectively a connection pool.

The TDB documentation states that "Dataset" should be used per thread.  Fuseki 
uses a slightly lower level route in.

The fix applied addresses the problem for Fuseki.  Whether it's the best fix 
for the whole system is something to think about, but any other solution must 
support Fuseki's mode of use, so any future change will be purely internal to 
support other usages (if a change is made at all - the current, new approach 
may be the best anyway).
                
> Running queries during graph PUT leads to PUT failing
> -----------------------------------------------------
>
>                 Key: JENA-230
>                 URL: https://issues.apache.org/jira/browse/JENA-230
>             Project: Apache Jena
>          Issue Type: Bug
>          Components: Fuseki
>    Affects Versions: Fuseki 0.2.1
>         Environment: Macosx. 
>            Reporter: Bill Roberts
>
> 1. create empty dir for TDB and start fuseki with that directory as the 
> tdb:location
> 2. PUT a small file to the graph protocol endpoint
> curl -v -H "Content-Type: text/turtle" --upload-file small.ttl http://localhost:3030/crashtest/data?graph=http://test1
> 3. Run a count of all triples (this step not necessary to reproduce, but just 
> a baseline check to compare against later stages).
> select (count(*) as ?c) where {?s ?p ?o} 
> Answer in my case is 25 (as expected)
> 4. PUT a big file
> curl -v -H "Content-Type: text/turtle" --upload-file big.ttl http://localhost:3030/crashtest/data?graph=http://test2
> (big enough that it takes at least several seconds to load, so you have time 
> to run some other stuff. My example was about 200,000 triples)
> 5. Before it finishes, run the count query another 2 or more times.
> It comes back with 25 each time. So far so good.
> 6.  After the big file load is finished (check for 201 Created in log), run 
> the count again.
> This is where the problem is evident: the count still shows 25, when it 
> should show 200,000 or so.
> (Probably not significant, but my small test file has a few blank nodes in 
> it.  The big file does not).
> ls -l of the TDB dir shows lots of data still in nodes.dat-jrnl.  The log 
> includes the line:
> "WARN  TDB                  :: Transaction not active: 5"
> (full copy of the log below)
> When going through the same procedure without running the COUNTs mentioned 
> in step 5, everything goes smoothly.  
> I'd be interested to hear if anyone else can reproduce this - and of course 
> to hear what you think might be wrong!
> Many thanks
> Bill
> Details:
> OS:  Macosx 10.6.8
> fuseki-server --version
> ------------------------------
> Jena:       VERSION: 2.7.0-incubating
> Jena:       BUILD_DATE: 2011-12-14T14:54:09+0000
> ARQ:        VERSION: 2.9.0-incubating
> ARQ:        BUILD_DATE: 2011-12-14T15:04:27+0000
> TDB:        VERSION: 0.9.0-incubating
> TDB:        BUILD_DATE: 2012-02-29T19:39:52+0000
> Fuseki:     VERSION: 0.2.2-incubating-SNAPSHOT
> Fuseki:     BUILD_DATE: 20120330-0505
> java -version
> ------------------
> java version "1.6.0_29"
> Java(TM) SE Runtime Environment (build 1.6.0_29-b11-402-10M3527)
> Java HotSpot(TM) 64-Bit Server VM (build 20.4-b02-402, mixed mode)
> Config file:
> ---------------
> @prefix tdb:     <http://jena.hpl.hp.com/2008/tdb#> .
> @prefix rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
> @prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#> .
> @prefix ja:      <http://jena.hpl.hp.com/2005/11/Assembler#> .
> @prefix fuseki:  <http://jena.apache.org/fuseki#> .
> [] rdf:type fuseki:Server ;     
>   # Services available.  Only explicitly listed services are configured.
>   #  If there is a service description not linked from this list, it is ignored.
>   fuseki:services (
>     <#service1>
>   ) .
> [] ja:loadClass "com.hp.hpl.jena.tdb.TDB" .
> tdb:DatasetTDB  rdfs:subClassOf  ja:RDFDataset .
> tdb:GraphTDB    rdfs:subClassOf  ja:Model .
> <#service1>  rdf:type fuseki:Service ;
>    fuseki:name              "crashtest" ;       # http://host:port/blah
>    fuseki:serviceQuery      "query" ;    # SPARQL query service
>    fuseki:serviceUpdate     "update" ;   # SPARQL update service
>    fuseki:serviceReadWriteGraphStore "data" ;     # SPARQL Graph store protocol (read and write)
>    fuseki:dataset           <#dataset-blah> ;
>    .
> <#dataset-blah> rdf:type      tdb:DatasetTDB ;
>     tdb:location "/Users/bill/tdb/crashtest" ;
>     # Query timeout on this dataset (10s, 10000 milliseconds)
>     ja:context [ ja:cxtName "arq:queryTimeout" ;  ja:cxtValue "10000" ] ;
>     tdb:unionDefaultGraph true ;
>     .
> fuseki log:
> --------------
> 20:23:51 INFO  Config               :: Configuration file: test.ttl
> 20:23:51 INFO  Config               :: Service: <file:///Users/bill/code/fuseki-0.2.2/test.ttl#service1>
> 20:23:51 INFO  Config               ::   name = crashtest
> 20:23:51 INFO  Config               ::   query = /crashtest/query
> 20:23:51 INFO  Config               ::   update = /crashtest/update
> 20:23:51 INFO  Config               ::   graphStore(RW) = /crashtest/data
> 20:23:52 INFO  Server               :: Dataset path = /crashtest
> 20:23:52 INFO  Server               :: Fuseki 0.2.2-incubating-SNAPSHOT 20120330-0505
> 20:23:52 INFO  Server               :: Jetty 7.x.y-SNAPSHOT
> 20:23:52 INFO  Server               :: Started 2012/04/01 20:23:52 BST on port 3030
> 20:24:15 INFO  Fuseki               :: [1] PUT http://localhost:3030/crashtest/data?graph=http://test1
> 20:24:16 INFO  Fuseki               :: [1] 201 Created
> 20:24:43 INFO  Fuseki               :: [2] GET http://localhost:3030/crashtest/query?query=select+%28count%28*%29+as+%3Fc%29+where+%7B%3Fs+%3Fp+%3Fo%7D+&output=text&stylesheet=%2Fxml-to-html.xsl
> 20:24:43 INFO  Fuseki               :: [2] Query = select (count(*) as ?c) where {?s ?p ?o} 
> 20:24:43 INFO  Fuseki               :: [2] OK/select
> 20:24:43 INFO  Fuseki               :: [2] 200 OK
> 20:29:12 INFO  Fuseki               :: [3] PUT http://localhost:3030/crashtest/data?graph=http://test2
> 20:29:14 INFO  Fuseki               :: [4] GET http://localhost:3030/crashtest/query?query=select+%28count%28*%29+as+%3Fc%29+where+%7B%3Fs+%3Fp+%3Fo%7D+&output=text&stylesheet=%2Fxml-to-html.xsl
> 20:29:14 INFO  Fuseki               :: [4] Query = select (count(*) as ?c) where {?s ?p ?o} 
> 20:29:14 INFO  Fuseki               :: [4] OK/select
> 20:29:14 INFO  Fuseki               :: [4] 200 OK
> 20:29:18 INFO  Fuseki               :: [5] GET http://localhost:3030/crashtest/query?query=select+%28count%28*%29+as+%3Fc%29+where+%7B%3Fs+%3Fp+%3Fo%7D+&output=text&stylesheet=%2Fxml-to-html.xsl
> 20:29:18 INFO  Fuseki               :: [5] Query = select (count(*) as ?c) where {?s ?p ?o} 
> 20:29:18 INFO  Fuseki               :: [5] OK/select
> 20:29:18 INFO  Fuseki               :: [5] 200 OK
> 20:29:28 WARN  TDB                  :: Transaction not active: 5
> 20:29:28 INFO  Fuseki               :: [3] 201 Created
> 20:29:28 INFO  Fuseki               :: [6] GET http://localhost:3030/crashtest/query?query=select+%28count%28*%29+as+%3Fc%29+where+%7B%3Fs+%3Fp+%3Fo%7D+&output=text&stylesheet=%2Fxml-to-html.xsl
> 20:29:28 INFO  Fuseki               :: [6] Query = select (count(*) as ?c) where {?s ?p ?o} 
> 20:29:28 INFO  Fuseki               :: [6] OK/select
> 20:29:28 INFO  Fuseki               :: [6] 200 OK
> ls -l crashtest
> ------------------
> drwxr-xr-x  31 bill  bill      1054  1 Apr 20:24 ./
> drwxr-xr-x  15 bill  bill       510  1 Apr 20:23 ../
> -rw-r--r--   1 bill  bill  16777216  1 Apr 20:29 GOSP.dat
> -rw-r--r--   1 bill  bill   8388608  1 Apr 20:23 GOSP.idn
> -rw-r--r--   1 bill  bill  16777216  1 Apr 20:29 GPOS.dat
> -rw-r--r--   1 bill  bill   8388608  1 Apr 20:23 GPOS.idn
> -rw-r--r--   1 bill  bill  16777216  1 Apr 20:29 GSPO.dat
> -rw-r--r--   1 bill  bill   8388608  1 Apr 20:23 GSPO.idn
> -rw-r--r--   1 bill  bill   8388608  1 Apr 20:23 OSP.dat
> -rw-r--r--   1 bill  bill   8388608  1 Apr 20:23 OSP.idn
> -rw-r--r--   1 bill  bill  16777216  1 Apr 20:29 OSPG.dat
> -rw-r--r--   1 bill  bill   8388608  1 Apr 20:23 OSPG.idn
> -rw-r--r--   1 bill  bill   8388608  1 Apr 20:23 POS.dat
> -rw-r--r--   1 bill  bill   8388608  1 Apr 20:23 POS.idn
> -rw-r--r--   1 bill  bill  16777216  1 Apr 20:29 POSG.dat
> -rw-r--r--   1 bill  bill   8388608  1 Apr 20:23 POSG.idn
> -rw-r--r--   1 bill  bill   8388608  1 Apr 20:23 SPO.dat
> -rw-r--r--   1 bill  bill   8388608  1 Apr 20:23 SPO.idn
> -rw-r--r--   1 bill  bill  16777216  1 Apr 20:29 SPOG.dat
> -rw-r--r--   1 bill  bill   8388608  1 Apr 20:23 SPOG.idn
> -rw-r--r--   1 bill  bill         0  1 Apr 20:24 journal.jrnl
> -rw-r--r--   1 bill  bill   8388608  1 Apr 20:24 node2id.dat
> -rw-r--r--   1 bill  bill   8388608  1 Apr 20:24 node2id.idn
> -rw-r--r--   1 bill  bill      2485  1 Apr 20:24 nodes.dat
> -rw-r--r--   1 bill  bill   5596812  1 Apr 20:29 nodes.dat-jrnl
> -rw-r--r--   1 bill  bill   8388608  1 Apr 20:23 prefix2id.dat
> -rw-r--r--   1 bill  bill   8388608  1 Apr 20:23 prefix2id.idn
> -rw-r--r--   1 bill  bill   8388608  1 Apr 20:23 prefixIdx.dat
> -rw-r--r--   1 bill  bill   8388608  1 Apr 20:23 prefixIdx.idn
> -rw-r--r--   1 bill  bill         0  1 Apr 20:23 prefixes.dat
> -rw-r--r--   1 bill  bill         0  1 Apr 20:24 prefixes.dat-jrnl


        
