Github user emlaver commented on a diff in the pull request:

    https://github.com/apache/bahir/pull/57#discussion_r157533814
  
    --- Diff: sql-cloudant/src/main/scala/org/apache/bahir/cloudant/internal/ChangesReceiver.scala ---
    @@ -39,56 +37,38 @@ class ChangesReceiver(config: CloudantChangesConfig)
       }
     
       private def receive(): Unit = {
    -    // Get total number of docs in database using _all_docs endpoint
    -    val limit = new JsonStoreDataAccess(config)
    -      .getTotalRows(config.getTotalUrl, queryUsed = false)
    -
    -    // Get continuous _changes url
    +    // Get normal _changes url
    --- End diff --
    
    For our internal implementation, we (Mayya and I) wanted the user to 
have a snapshot of the data to load into Spark.  To make that possible, we 
decided to use the `continuous` style feed with a doc limit.  With the new 
_changes implementation from Mike's project, the `normal` feed is stable and 
works as expected.  I've also reduced the number of requests and the load time 
by removing the HTTP request for the doc limit, since it's not needed with the 
`normal` style _changes feed.
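    As a rough illustration of the difference between the two feed styles (a 
sketch only, not the code in this PR), the request URLs could be built as 
below.  `ChangesUrlSketch`, `changesUrl`, the host and the database name are 
hypothetical; `feed`, `include_docs` and `limit` are the standard _changes 
query parameters.

```scala
// Hypothetical helper for illustration; not part of Bahir.
object ChangesUrlSketch {
  // Builds a _changes URL for the given feed style.
  // `limit` is only needed when a `continuous` feed has to stop on its own.
  def changesUrl(baseUrl: String,
                 db: String,
                 feed: String,
                 limit: Option[Long] = None): String = {
    val params = Seq("feed" -> feed, "include_docs" -> "true") ++
      limit.map(n => "limit" -> n.toString)
    val query = params.map { case (k, v) => s"$k=$v" }.mkString("&")
    s"$baseUrl/$db/_changes?$query"
  }
}

// Old approach: continuous feed, capped by a doc count that had to be
// fetched first with an extra HTTP request (1000 is a made-up value).
val continuousUrl = ChangesUrlSketch.changesUrl(
  "https://example.cloudant.com", "mydb", "continuous", Some(1000L))

// New approach: the normal feed ends by itself, so no limit is required.
val normalUrl = ChangesUrlSketch.changesUrl(
  "https://example.cloudant.com", "mydb", "normal")
```

    The `normal` URL carries no `limit`, which is why the extra request for 
the total doc count can be dropped.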
    To work with data in "real-time", you can use `CloudantReceiver`, which 
creates a never-ending changes feed within the Spark Streaming context.

