Re: How to increase replication rate from a large single node database?

2024-03-15 Thread Jim Mason
Sounds like a good solution for replication. Thanks very much for sharing.
Jim Mason
On Friday, March 15, 2024 at 04:19:13 AM EDT, Hoël Iris wrote:
 
 Hi!

You might be interested in this open source project that we wrote at my
work for our daily backups: https://github.com/tolteck/couchcopy

It moves shards with a simple rsync then uses [CouchDB shard management](
https://docs.couchdb.org/en/stable/cluster/sharding.html) to make the new
cluster aware of these shards.
If you want no downtime, after running couchcopy you can replicate the diff
accumulated during the run with a usual replication.
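
As a rough illustration of that last step, a catch-up replication can be described by a replication document stored in the `_replicator` database. The sketch below only builds the document. The host names, database name, and tuning values are hypothetical placeholders, and the per-replication options shown (`worker_processes`, `worker_batch_size`, `http_connections`) should be checked against the replicator documentation for your CouchDB version.

```python
import json


def build_replication_doc(source_url, target_url, continuous=True,
                          worker_processes=None, worker_batch_size=None,
                          http_connections=None):
    """Build a replication document for the _replicator database.

    Tuning options are only included when a value is given, so the
    server defaults apply otherwise.
    """
    doc = {"source": source_url, "target": target_url, "continuous": continuous}
    tuning = {
        "worker_processes": worker_processes,
        "worker_batch_size": worker_batch_size,
        "http_connections": http_connections,
    }
    doc.update({key: value for key, value in tuning.items() if value is not None})
    return doc


# Hypothetical URLs and values, for illustration only (credentials omitted).
doc = build_replication_doc(
    "http://old-node.example:5984/research",
    "http://new-cluster.example:5984/research",
    worker_processes=8,
    worker_batch_size=2000,
    http_connections=40,
)
print(json.dumps(doc, indent=2))
```

The printed JSON could then be saved with a `PUT` to `/_replicator/<some-id>` on the cluster that should run the replication.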

There is also this upstream discussion where I asked whether it was a good
idea to write a tool like couchcopy:
https://github.com/apache/couchdb/discussions/3383

Everything that couchcopy does could be done manually with curl following
[CouchDB shard management](
https://docs.couchdb.org/en/stable/cluster/sharding.html) documentation. In
your case, with only one database, it could be quicker to do it manually.
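
For the manual route, the central step in the shard management documentation is editing the database's shard map document (fetched from `/_node/_local/_dbs/{db}` and written back with a `PUT`) after the shard files have been copied into place. The following is a minimal sketch of that edit as a pure function, not couchcopy's actual code. The node names and the two-range layout in the example are hypothetical.

```python
import copy


def add_node_to_shard_map(shard_map, node):
    """Return a copy of a shard map document with `node` added to every range.

    Updates "by_range", "by_node" and appends matching "add" entries to
    "changelog". The input document is left untouched.
    """
    updated = copy.deepcopy(shard_map)
    by_node = updated.setdefault("by_node", {})
    changelog = updated.setdefault("changelog", [])
    node_ranges = by_node.setdefault(node, [])
    for shard_range, nodes in sorted(updated.get("by_range", {}).items()):
        if node not in nodes:
            nodes.append(node)
            changelog.append(["add", shard_range, node])
        if shard_range not in node_ranges:
            node_ranges.append(shard_range)
    return updated


# Hypothetical shard map for a q=2 database living on a single node.
example = {
    "_id": "research",
    "changelog": [
        ["add", "00000000-7fffffff", "couchdb@old"],
        ["add", "80000000-ffffffff", "couchdb@old"],
    ],
    "by_node": {"couchdb@old": ["00000000-7fffffff", "80000000-ffffffff"]},
    "by_range": {
        "00000000-7fffffff": ["couchdb@old"],
        "80000000-ffffffff": ["couchdb@old"],
    },
}
new_map = add_node_to_shard_map(example, "couchdb@new")
```

The returned document keeps the original `_id` (and `_rev`, if present), so it can be sent back with a `PUT` to the same URL it was fetched from.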

Don't hesitate to PM me or open an issue on couchcopy if needed.


On Fri, 15 Mar 2024 at 08:39, Chris Bayliss wrote:

> Hi all,
>
> I inherited a single-node CouchDB database that backs a medical research
> project. We’ve been using CouchDB for 10+ years, so that was not a concern.
> Then I spotted that it uses a single database to store billions (10^9, if
> we’re being pedantic) of documents (2B at the time, just over a TB of data)
> across the default 2 shards. Not ideal, but technically not a problem. Then
> I spotted that it’s ingesting ~30M documents a day and was continuously
> compressing and reindexing everything associated with this database.
>
> Skipping over months of trial and error, I’m currently replicating it to a
> 4-node, NVMe-backed cluster (n=3, q=256). Everything is running 3.3.3 (the
> Erlang 24.3 version). I’ve read [1] and [2], and right now it’s replicating
> at 2.25k documents a second, +/- 0.5k. This is acceptable, and it will catch
> up with the initial node eventually, but at the rate it’s going it’ll take
> ~60 days.
>
> How can I speed this process up, if at all?
>
> I’d add that the code that accesses this database isn’t mine either, so
> splitting the database out into logical subsets isn’t an option at this
> time.
>
> Thanks
>
>    Chris
>
> 1 -
> https://blog.cloudant.com/2023/02/08/Replication-efficiency-improvements.html
> 2 - https://github.com/apache/couchdb/issues/4308
>
>
> --
> Christopher Bayliss
> Senior Software Engineer, Melbourne eResearch Group
>
> School of Computing and Information Systems
> Level 5, Melbourne Connect (Building 290)
> University of Melbourne, VIC, 3010, Australia
>
> Email: christopher.bayl...@unimelb.edu.au
>
>
>
  

Re: Introducing Structured Query Server, SQL Queries for CouchDB

2023-07-12 Thread Jim Mason
Hi Jan,
Thanks for this SQL solution.
What I really need in order to use CouchDB as a "go-to" NoSQL solution is an
open-source JDBC driver. There isn't an open-source one that I know of. The
second I have that driver, it connects a world of solutions to CouchDB.
Without that, CouchDB is of limited value for us.
Thanks
Jim
On Thursday, July 6, 2023 at 11:03:07 AM EDT, Jan Lehnardt wrote:
 
 Dear CouchDB user community,

My company Neighbourhoodie is happy to announce Structured Query Server,
a CouchDB companion application that gives you full-fidelity SQL query
abilities for your CouchDB installations.

See all infos, technical details and benchmarks on our product page:

    https://neighbourhood.ie/products-and-services/structured-query-server


Do let me know off-list, if you have any questions.

Best
Jan
—

  

Re: Introducing Structured Query Server, SQL Queries for CouchDB

2023-07-06 Thread Jim Mason
Agreed, because there are so many good open-source tools built on JDBC drivers
and SQL.
On Thursday, July 6, 2023 at 11:54:00 AM EDT, Cluxter wrote:
 
 This reminds me of this tweet:
https://twitter.com/JordiCabot/status/1576496626566242305
Quoting: "Every nosql database evolves to include sql or dies"
Looks like we reached that point lol.
Great work, this should be useful to many people and it should help make
CouchDB more popular.



On Thu, 6 Jul 2023 at 17:09, Jim Mason wrote:

> Thanks. Interesting solution for the right use case.
> Jim
>    On Thursday, July 6, 2023 at 11:03:07 AM EDT, Jan Lehnardt <
> j...@apache.org> wrote:
>
>  Dear CouchDB user community,
>
> My company Neighbourhoodie is happy to announce Structured Query Server,
> a CouchDB companion application that gives you full-fidelity SQL query
> abilities for your CouchDB installations.
>
> See all infos, technical details and benchmarks on our product page:
>
>    https://neighbourhood.ie/products-and-services/structured-query-server
>
>
> Do let me know off-list, if you have any questions.
>
> Best
> Jan
> —
>
>
  

Re: Introducing Structured Query Server, SQL Queries for CouchDB

2023-07-06 Thread Jim Mason
Thanks. Interesting solution for the right use case.
Jim
On Thursday, July 6, 2023 at 11:03:07 AM EDT, Jan Lehnardt wrote:
 
 Dear CouchDB user community,

My company Neighbourhoodie is happy to announce Structured Query Server,
a CouchDB companion application that gives you full-fidelity SQL query
abilities for your CouchDB installations.

See all infos, technical details and benchmarks on our product page:

    https://neighbourhood.ie/products-and-services/structured-query-server


Do let me know off-list, if you have any questions.

Best
Jan
—

  

Re: PSCouchDB 2.5

2022-06-27 Thread Jim Mason
 Thanks Matteo,
Much appreciated!
Jim Mason
On Monday, June 27, 2022, 04:39:43 AM EDT, Matteo Guadrini wrote:
 
 Hello everybody,
I wanted to announce the new version of PSCouchDB, a complete,
object-oriented CLI for CouchDB, written in PowerShell (and running on any
operating system).
Find the release notes here: 
https://github.com/MatteoGuadrini/PSCouchDB/releases/tag/2.5.0

Other useful links
Installation: 
https://github.com/MatteoGuadrini/PSCouchDB#installation-and-simple-usage
Full docs: https://pscouchdb.readthedocs.io/en/latest/
PowershellGallery: https://www.powershellgallery.com/packages/PSCouchDB/2.5.0
Site: https://matteoguadrini.github.io/PSCouchDB

Thanks to everyone!

Matteo Guadrini
  

Re: New CouchDB CLI tool

2021-04-28 Thread Jim Mason
Hi Jonathan,
Thanks for creating this CouchDB CLI tool. When I jump back to Couch work on my
Fabric projects I'll definitely try this out.
Jim
On Wednesday, April 28, 2021, 06:16:07 AM EDT, Jonathan Hall wrote:
 
 I'd love to hear about your experience in this area.  This tool is not 
(yet) optimized to be a good backup solution, but it's something that 
might make good sense in the future.  The biggest known limitation is 
the inability for the filesystem-based replications to store state 
information, so every replication starts "from scratch", rather than 
starting from the last sequence id of the source database.  If/when this 
limitation is resolved, it would be quite simple to do "incremental 
backups" from CouchDB to a filesystem.
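
To make the idea concrete, here is a minimal sketch of what storing replication state on the filesystem could look like: the last processed sequence id is saved next to the backup, and the next run asks the `_changes` feed only for what came after it. This is an illustration of the general checkpoint technique, not how the tool works today. The file name, its JSON layout, and the database URL are hypothetical.

```python
import json
import os
import tempfile
from urllib.parse import urlencode


def load_checkpoint(path):
    """Return the last saved sequence id, or "0" to start from scratch."""
    try:
        with open(path, encoding="utf-8") as fh:
            return json.load(fh)["last_seq"]
    except FileNotFoundError:
        return "0"


def save_checkpoint(path, last_seq):
    """Persist the sequence id, replacing the old file in one step."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as fh:
        json.dump({"last_seq": last_seq}, fh)
    os.replace(tmp_path, path)


def changes_url(db_url, since):
    """Build a _changes URL that resumes after the given sequence id."""
    return db_url + "/_changes?" + urlencode({"since": since, "include_docs": "true"})


# Hypothetical backup directory and database URL.
backup_dir = tempfile.mkdtemp()
checkpoint = os.path.join(backup_dir, "checkpoint.json")

print(changes_url("http://localhost:5984/mydb", load_checkpoint(checkpoint)))
save_checkpoint(checkpoint, "42-abc")  # would be the feed's last_seq in practice
print(changes_url("http://localhost:5984/mydb", load_checkpoint(checkpoint)))
```

In a real incremental backup, the value passed to `save_checkpoint` would be the `last_seq` returned by the `_changes` response, written only after the fetched documents are safely on disk.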

Jonathan


On 4/28/21 11:43 AM, Sebastien wrote:
> Looks great, thanks for sharing!
>
> I'll soon evaluate solutions to back up my CouchDB servers; maybe this'll
> help!
>
> kr,
> Sébastien
>
> On Tue, Apr 27, 2021 at 10:25 PM Jonathan Hall  wrote:
>
>> Good day everyone!
>>
>> I'd like to announce the "alpha" release of a new CLI tool for
>> interacting with CouchDB.  The tool is designed to replace `curl` as a
>> tool for interacting with the CouchDB API for administrative and
>> debugging tasks.  Further, it offers the ability to replicate
>> between CouchDB servers and local filesystem directories, thus
>> facilitating bootstrapping of CouchDB servers.
>>
>> Read the full announcement here: http://kivik.io/kivik-cli-pre-release
>>
>> Download binaries for common architectures here:
>> https://github.com/go-kivik/xkivik/releases
>>
>> There's still some work to be done, and there are no doubt some rough
>> edges and bugs. I welcome any feedback!
>>
>> Jonathan
>>
>>
>>
  

Re: [ANNOUNCE] Apache CouchDB 3.0.0 released

2020-02-26 Thread Jim Mason
 Great product and really like the 3.0 features.
When will we get JDBC driver support to standardize use of CouchDB as our 
preferred database?
Thanks,
Jim Mason

On Wednesday, February 26, 2020, 12:49:54 PM EST, Jan Lehnardt wrote:
 
 Dear community,

Apache CouchDB® 3.0.0 has been released and is available for download.

Apache CouchDB® lets you access your data where you need it. The Couch 
Replication Protocol is implemented in a variety of projects and products that 
span every imaginable computing environment from globally distributed 
server-clusters, over mobile phones to web browsers.

Store your data safely, on your own servers, or with any leading cloud 
provider. Your web- and native applications love CouchDB, because it speaks 
JSON natively and supports binary data for all your data storage needs.

The Couch Replication Protocol lets your data flow seamlessly between server 
clusters to mobile phones and web browsers, enabling a compelling offline-first 
user-experience while maintaining high performance and strong reliability. 
CouchDB comes with a developer-friendly query language, and optionally 
MapReduce for simple, efficient, and comprehensive data retrieval.

https://couchdb.apache.org/#download

Pre-built packages for Windows, macOS, Debian/Ubuntu and RHEL/CentOS are 
available. Docker images have been submitted to Docker Hub for review and will 
be available as soon as that process is done.

CouchDB 3.0.0 is a major release, and was originally published on 2020-02-26.

The community would like to thank all contributors for their part in making 
this release, from the smallest bug report or patch to major contributions in 
code, design, or marketing, we couldn’t have done it without you!

See the official release notes document for an exhaustive list of all changes:

http://docs.couchdb.org/en/stable/whatsnew/3.0.html

Release Notes highlights:

  - Default installations are now secure and locked down.

  - User-defined partitioned databases for faster querying

  - Live Shard Splitting for incremental scale-out

  - Updated to modern JavaScript engine SpiderMonkey 60

  - Official support for ARM and PPC 32bit and 64bit systems

  - Many large and small performance improvements

  - Automatic view index warmer

  - Smarter Compaction Daemon

  - Smarter I/O Queue

  - Much improved installers for Windows

  - macOS binaries are now Notarized for full future Catalina support

  - Extremely simplified setup of Lucene search

See the “Road to CouchDB 3.0” blog post series for many more details: 
http://blog.couchdb.org/2020/02/25/the-road-to-couchdb-3-0/

On behalf of the CouchDB PMC,
Jan Lehnardt
—