JointHero opened a new issue #5249: Where's the messages gone ? / pulsar SQL 
(Presto) can only query data less than 10000 ?
URL: https://github.com/apache/pulsar/issues/5249
 
 
   Architecture:
   1. Pulsar cluster: ZooKeeper (3 nodes) + BookKeeper (3 nodes) + 
Broker & function worker (3 nodes) + proxy (1 node) + SQL worker (3 nodes);
   all of these nodes run in Docker
   2. NiFi: 1 node running in Docker
   
   Scenario:
   I created a persistent topic named "persistent://public/default/tusr1" and use a 
NiFi workflow to read user data from https://randomuser.me/ and push it 
into "persistent://public/default/tusr1". Then I use Pulsar SQL to 
get the message count.
   The SQL is simply: [  SELECT COUNT(ssn) FROM 
pulsar."public/default"."tusr1"; ]
   I expected the return value to be the number of messages stored in BookKeeper,
   but once the message count goes above 100000, the query returns only 50000+.
   
   I ran "pulsar-admin persistent stats-internal 
persistent://public/default/tusr1" to get the internal stats and got the following:
   {
     "entriesAddedCounter" : 149000,
     "numberOfEntries" : 11999,
     "totalSize" : 1670626,
     "currentLedgerEntries" : 2999,
     "currentLedgerSize" : 417685,
     "lastLedgerCreatedTimestamp" : "2019-09-23T00:16:38.965Z",
     "waitingCursorsCount" : 0,
     "pendingAddEntriesCount" : 0,
     "lastConfirmedEntry" : "8088:2998",
     "state" : "LedgerOpened",
     "ledgers" : [ {
       "ledgerId" : 5638,
       "entries" : 9000,
       "size" : 1252941,
       "offloaded" : false
     }, {
       "ledgerId" : 8088,
       "entries" : 0,
       "size" : 0,
       "offloaded" : false
     } ],
     "cursors" : { }
   }
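
A quick way to cross-check these numbers is sketched below in Python. It only copies the relevant fields from the output above into a string (`stats_json` is a name made up for this example). It assumes that the open ledger's entries are reported in "currentLedgerEntries" rather than in the "ledgers" list, which is what the output above suggests (ledger 8088 shows 0 entries while "lastConfirmedEntry" is "8088:2998"). It says nothing about why the counters differ; it only makes the gap explicit.

```python
import json

# Fields copied from the stats-internal output above; other fields are omitted.
stats_json = """
{
  "entriesAddedCounter": 149000,
  "numberOfEntries": 11999,
  "currentLedgerEntries": 2999,
  "ledgers": [
    {"ledgerId": 5638, "entries": 9000},
    {"ledgerId": 8088, "entries": 0}
  ]
}
"""
stats = json.loads(stats_json)

# Entries in the listed ledgers. The open ledger (8088) shows 0 here, so its
# entries are taken from currentLedgerEntries (an assumption, see above).
listed = sum(ledger["entries"] for ledger in stats["ledgers"])
retained = listed + stats["currentLedgerEntries"]

# Entries that were added but are not counted in numberOfEntries.
gap = stats["entriesAddedCounter"] - stats["numberOfEntries"]

print("entries in listed ledgers + current ledger:", retained)
print("numberOfEntries reported:", stats["numberOfEntries"])
print("entries added but no longer counted:", gap)
```

With the values above, the listed ledgers plus the current ledger add up to exactly "numberOfEntries", so the two ledgers shown account for everything the topic currently reports, and far fewer entries than "entriesAddedCounter".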
   
   So I guess the messages are all stored in BookKeeper, but Presto can only 
query 2 segments? Is that right?
   
   Can anyone help explain this? And how can I query all of the 
messages?
   
   Thank you very much.
   
   #### System configuration
   **Pulsar version**:2.4.1
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services
