Re: Replication to second data center with different number of nodes

2015-03-28 Thread Colin Clark
I typically use a number a lot lower than 256, usually less than 20, for num_tokens, 
as a larger number has historically had a dramatic impact on query performance.
—
Colin Clark
co...@clark.ws
+1 612-859-6129
skype colin.p.clark

 On Mar 28, 2015, at 3:46 PM, Eric Stevens migh...@gmail.com wrote:
 
 If you're curious about how Cassandra knows how to replicate data in the 
 remote DC: it's the same as in the local DC. Replication is independent in 
 each, and you can even set a different replication strategy per keyspace per 
 datacenter.  Nodes in each DC take up num_tokens positions on a ring, each 
 partition key is mapped to a position on that ring, and whoever owns that 
 part of the ring is the primary for that data.  Then (oversimplified) RF-1 
 adjacent nodes become replicas for that same data.
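 For concreteness, the per-keyspace, per-datacenter replication Eric describes is 
 declared on the keyspace itself via NetworkTopologyStrategy. A minimal sketch with 
 the DataStax Java driver follows; the keyspace name, DC names, and factors are 
 placeholders and must match the datacenter names your snitch reports (see 
 nodetool status):

 import com.datastax.driver.core.Cluster;
 import com.datastax.driver.core.Session;

 public class KeyspaceSetup {
     public static void main(String[] args) {
         Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
         Session session = cluster.connect();

         // One replication factor per datacenter; nodes within a DC then share
         // the load according to the token ranges they own.
         session.execute(
             "CREATE KEYSPACE IF NOT EXISTS my_keyspace WITH replication = "
           + "{'class': 'NetworkTopologyStrategy', 'DC1': 3, 'DC2': 3}");

         cluster.close();
     }
 }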
 
 On Fri, Mar 27, 2015 at 6:55 AM, Sibbald, Charles charles.sibb...@bskyb.com wrote:
 http://www.datastax.com/documentation/cassandra/2.0/cassandra/configuration/configCassandra_yaml_r.html?scroll=reference_ds_qfg_n1r_1k__num_tokens
 
 So go with the default of 256, and leave initial_token empty:
 
 num_tokens: 256
 # initial_token:
 
 Cassandra will always give each node the same number of tokens; the only time 
 you might want to vary this is if your instances are of different 
 sizes/capabilities, which is itself a bad scenario.
 
 From: Björn Hachmann bjoern.hachm...@metrigo.de
 Reply-To: user@cassandra.apache.org
 Date: Friday, 27 March 2015 12:11
 To: user user@cassandra.apache.org
 Subject: Re: Replication to second data center with different number of nodes
 
 
 2015-03-27 11:58 GMT+01:00 Sibbald, Charles charles.sibb...@bskyb.com:
 Cassandra’s Vnodes config
 
 Thank you. Yes, we are using vnodes! The num_tokens parameter controls the 
 number of vnodes assigned to a specific node.
 
 Maybe I am seeing problems where there are none. 
 
 Let me rephrase my question: How does Cassandra know it has to replicate 1/3 
 of all keys to each single node in the second DC? I can see two ways:
  1. It has to be configured explicitly.
  2. It is derived from the number of nodes available in the data center at 
 the time `nodetool rebuild` is started.
 
 Kind regards
 Björn
 





Re: ('Unable to complete the operation against any hosts', {})

2015-03-28 Thread Dhanasekaran
Rahul,
Can you try reducing the batch size to 1000?

Also what is the write consistency level?

Thanks,
Dhanasekaran

 On 29-Mar-2015, at 12:30 am, Rahul Bhardwaj rahul.bhard...@indiamart.com 
 wrote:
 
 Hi All,
 
 awaiting any response.. please help
 
 
 regards:
 rahul
 
 On Fri, Mar 27, 2015 at 5:54 PM, Rahul Bhardwaj 
 rahul.bhard...@indiamart.com wrote:
 Hi All,
 
 
 
 We are using Cassandra version 2.1.2 with cqlsh 5.0.1 (a cluster of three 
 nodes with RF 2).
 
 I need to load around 40 million records into a Cassandra table. I have 
 created batches of 1 million records (other batch sizes give the same 
 error) in CSV format. When I use the COPY command to import, I get this 
 error, which is causing problems:
 
 cqlsh:mesh_glusr copy 
 glusr_usr1(glusr_usr_id,glusr_usr_usrname,glusr_usr_pass,glusr_usr_membersince,glusr_usr_designation,glusr_usr_url,glusr_usr_modid,fk_gl_city_id,fk_gl_state_id,glusr_usr_ph2_area)
  from 'gl_a' with delimiter = '\t' and QUOTE = '';  

 Processed 36000 rows; Write: 1769.07 rows/s
 Record has the wrong number of fields (9 instead of 10).
 Aborting import at record #36769. Previously-inserted values still present.
 36669 rows imported in 20.571 seconds.
 cqlsh:mesh_glusr copy 
 glusr_usr1(glusr_usr_id,glusr_usr_usrname,glusr_usr_pass,glusr_usr_membersince,glusr_usr_designation,glusr_usr_url,glusr_usr_modid,fk_gl_city_id,fk_gl_state_id,glusr_usr_ph2_area)
  from 'gl_a' with delimiter = '\t' and QUOTE = '';
 Processed 185000 rows; Write: 1800.91 rows/s
 Record has the wrong number of fields (9 instead of 10).
 Aborting import at record #185607. Previously-inserted values still present.
 185507 rows imported in 1 minute and 43.428 seconds.
 
 [cqlsh 5.0.1 | Cassandra 2.1.2 | CQL spec 3.2.0 | Native protocol v3]
 Use HELP for help.
 cqlsh use mesh_glusr ;
 cqlsh:mesh_glusr copy 
 glusr_usr1(glusr_usr_id,glusr_usr_usrname,glusr_usr_pass,glusr_usr_membersince,glusr_usr_designation,glusr_usr_url,glusr_usr_modid,fk_gl_city_id,fk_gl_state_id,glusr_usr_ph2_area)
  from 'gl_a1' with delimiter = '\t' and QUOTE = '';
 Processed 373000 rows; Write: 1741.23 rows/s
 ('Unable to complete the operation against any hosts', {})
 Aborting import at record #373269. Previously-inserted values still present.
 
 
 When we remove the already-inserted records from the file and run the 
 command again for the remaining data, it inserts a few more records and then 
 gives the same error, without any further detail.
 
 Please help if anyone has any idea about this error.
 
 
 
 Regards:
 Rahul Bhardwaj
 
 
 


Re: ('Unable to complete the operation against any hosts', {})

2015-03-28 Thread Rahul Bhardwaj
Hi All,

awaiting any response.. please help


regards:
rahul

On Fri, Mar 27, 2015 at 5:54 PM, Rahul Bhardwaj 
rahul.bhard...@indiamart.com wrote:

 Hi All,



 We are using Cassandra version 2.1.2 with cqlsh 5.0.1 (a cluster of three
 nodes with RF 2).

 I need to load around 40 million records into a Cassandra table. I have
 created batches of 1 million records (other batch sizes give the same
 error) in CSV format. When I use the COPY command to import, I get this
 error, which is causing problems:

 cqlsh:mesh_glusr copy
 glusr_usr1(glusr_usr_id,glusr_usr_usrname,glusr_usr_pass,glusr_usr_membersince,glusr_usr_designation,glusr_usr_url,glusr_usr_modid,fk_gl_city_id,fk_gl_state_id,glusr_usr_ph2_area)
 from 'gl_a' with delimiter = '\t' and QUOTE = '';

 Processed 36000 rows; Write: 1769.07 rows/s
 Record has the wrong number of fields (9 instead of 10).
 Aborting import at record #36769. Previously-inserted values still present.
 36669 rows imported in 20.571 seconds.
 cqlsh:mesh_glusr copy
 glusr_usr1(glusr_usr_id,glusr_usr_usrname,glusr_usr_pass,glusr_usr_membersince,glusr_usr_designation,glusr_usr_url,glusr_usr_modid,fk_gl_city_id,fk_gl_state_id,glusr_usr_ph2_area)
 from 'gl_a' with delimiter = '\t' and QUOTE = '';
 Processed 185000 rows; Write: 1800.91 rows/s
 Record has the wrong number of fields (9 instead of 10).
 Aborting import at record #185607. Previously-inserted values still
 present.
 185507 rows imported in 1 minute and 43.428 seconds.

 [cqlsh 5.0.1 | Cassandra 2.1.2 | CQL spec 3.2.0 | Native protocol v3]
 Use HELP for help.
 cqlsh use mesh_glusr ;
 cqlsh:mesh_glusr copy
 glusr_usr1(glusr_usr_id,glusr_usr_usrname,glusr_usr_pass,glusr_usr_membersince,glusr_usr_designation,glusr_usr_url,glusr_usr_modid,fk_gl_city_id,fk_gl_state_id,glusr_usr_ph2_area)
 from 'gl_a1' with delimiter = '\t' and QUOTE = '';
 Processed 373000 rows; Write: 1741.23 rows/s
 ('Unable to complete the operation against any hosts', {})
 Aborting import at record #373269. Previously-inserted values still
 present.


 When we remove the already-inserted records from the file and run the
 command again for the remaining data, it inserts a few more records and then
 gives the same error, without any further detail.

 Please help if anyone has any idea about this error.



 Regards:
 Rahul Bhardwaj






Re: High latencies for simple queries

2015-03-28 Thread Ben Bromhead
cqlsh runs on the internal cassandra python drivers: cassandra-pylib and
cqlshlib.

I would not recommend using them at all (nothing wrong with them, they are
just not built with external users in mind).

I have never used python-driver in anger so I can't comment on whether it
is genuinely slower than the internal C* python driver, but this might be a
question for python-driver folk.
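
If it helps narrow things down, a quick client-side timing loop against a local
node can separate server latency from client overhead. This is only a sketch with
the Java driver (not python-driver), assuming a single local node and querying
system.local:

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;

public class LatencyProbe {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect();

        // Warm up connections before timing anything.
        for (int i = 0; i < 1000; i++) {
            session.execute("SELECT release_version FROM system.local");
        }

        int n = 10000;
        long start = System.nanoTime();
        for (int i = 0; i < n; i++) {
            session.execute("SELECT release_version FROM system.local");
        }
        double avgMs = (System.nanoTime() - start) / 1e6 / n;
        System.out.printf("average round trip: %.3f ms%n", avgMs);

        cluster.close();
    }
}

If an equivalent loop with python-driver is an order of magnitude slower, the
overhead is on the client side rather than in Cassandra itself.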

On 28 March 2015 at 00:34, Artur Siekielski a...@vhex.net wrote:

 On 03/28/2015 12:13 AM, Ben Bromhead wrote:

 One other thing to keep in mind / check is that doing these tests
 locally the cassandra driver will connect using the network stack,
 whereas postgres supports local connections over a unix domain socket
 (this is also enabled by default).

 Unix domain sockets are significantly faster than tcp as you don't have
 a network stack to traverse. I think any driver using libpq will attempt
 to use the domain socket when connecting locally.


 Good catch. I made sure that psycopg2 connects through a TCP socket and the
 numbers increased by about 20%, but it is still an order of magnitude
 faster than Cassandra.


 But I'm going to hazard a guess something else is going on with the
 Cassandra connection as I'm able to get 0.5ms queries locally and that's
 even with trace turned on.


 Using python-driver?




-- 

Ben Bromhead

Instaclustr | www.instaclustr.com | @instaclustr | (650) 284 9692


Re: ('Unable to complete the operation against any hosts', {})

2015-03-28 Thread Jonathan Haddad
Don't use batches for this. Use a lot of async queries. 

https://lostechies.com/ryansvihla/2014/08/28/cassandra-batch-loading-without-the-batch-keyword/
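
A rough sketch of what that can look like with the DataStax Java driver is below.
The table and the two columns are taken from the COPY statement in this thread,
but the column types, values, and the in-flight limit are only illustrative; the
point is to keep a bounded number of individual statements in flight instead of
wrapping unrelated inserts in one large BATCH.

import com.datastax.driver.core.BoundStatement;
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.ResultSetFuture;
import com.datastax.driver.core.Session;
import java.util.ArrayList;
import java.util.List;

public class AsyncLoader {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect("mesh_glusr");

        // Prepared once, bound per row. Only two columns shown; types are assumed.
        PreparedStatement insert = session.prepare(
            "INSERT INTO glusr_usr1 (glusr_usr_id, glusr_usr_usrname) VALUES (?, ?)");

        List<ResultSetFuture> inFlight = new ArrayList<ResultSetFuture>();
        int maxInFlight = 256;  // bound concurrency so the cluster isn't flooded

        for (int id = 0; id < 1000000; id++) {          // stand-in for reading the CSV
            BoundStatement bound = insert.bind(id, "user" + id);
            inFlight.add(session.executeAsync(bound));
            if (inFlight.size() >= maxInFlight) {
                for (ResultSetFuture f : inFlight) {
                    f.getUninterruptibly();             // a failed write throws here
                }
                inFlight.clear();
            }
        }
        for (ResultSetFuture f : inFlight) {
            f.getUninterruptibly();
        }
        cluster.close();
    }
}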

Jon

 On Mar 27, 2015, at 5:24 AM, Rahul Bhardwaj rahul.bhard...@indiamart.com 
 wrote:
 
 Hi All,
 
 
 
 We are using Cassandra version 2.1.2 with cqlsh 5.0.1 (a cluster of three 
 nodes with RF 2).
 
 I need to load around 40 million records into a Cassandra table. I have 
 created batches of 1 million records (other batch sizes give the same error) 
 in CSV format. When I use the COPY command to import, I get this error, 
 which is causing problems:
 
 cqlsh:mesh_glusr copy 
 glusr_usr1(glusr_usr_id,glusr_usr_usrname,glusr_usr_pass,glusr_usr_membersince,glusr_usr_designation,glusr_usr_url,glusr_usr_modid,fk_gl_city_id,fk_gl_state_id,glusr_usr_ph2_area)
  from 'gl_a' with delimiter = '\t' and QUOTE = '';   
   
 Processed 36000 rows; Write: 1769.07 rows/s
 Record has the wrong number of fields (9 instead of 10).
 Aborting import at record #36769. Previously-inserted values still present.
 36669 rows imported in 20.571 seconds.
 cqlsh:mesh_glusr copy 
 glusr_usr1(glusr_usr_id,glusr_usr_usrname,glusr_usr_pass,glusr_usr_membersince,glusr_usr_designation,glusr_usr_url,glusr_usr_modid,fk_gl_city_id,fk_gl_state_id,glusr_usr_ph2_area)
  from 'gl_a' with delimiter = '\t' and QUOTE = '';
 Processed 185000 rows; Write: 1800.91 rows/s
 Record has the wrong number of fields (9 instead of 10).
 Aborting import at record #185607. Previously-inserted values still present.
 185507 rows imported in 1 minute and 43.428 seconds.
 
 [cqlsh 5.0.1 | Cassandra 2.1.2 | CQL spec 3.2.0 | Native protocol v3]
 Use HELP for help.
 cqlsh use mesh_glusr ;
 cqlsh:mesh_glusr copy 
 glusr_usr1(glusr_usr_id,glusr_usr_usrname,glusr_usr_pass,glusr_usr_membersince,glusr_usr_designation,glusr_usr_url,glusr_usr_modid,fk_gl_city_id,fk_gl_state_id,glusr_usr_ph2_area)
  from 'gl_a1' with delimiter = '\t' and QUOTE = '';
 Processed 373000 rows; Write: 1741.23 rows/s
 ('Unable to complete the operation against any hosts', {})
 Aborting import at record #373269. Previously-inserted values still present.
 
 
 When we remove the already-inserted records from the file and run the 
 command again for the remaining data, it inserts a few more records and then 
 gives the same error, without any further detail.
 
 Please help if anyone has any idea about this error.
 
 
 
 Regards:
 Rahul Bhardwaj
 
 
 
 


Re: Replication to second data center with different number of nodes

2015-03-28 Thread Eric Stevens
If you're curious about how Cassandra knows how to replicate data in the
remote DC: it's the same as in the local DC. Replication is independent in
each, and you can even set a different replication strategy per keyspace
per datacenter.  Nodes in each DC take up num_tokens positions on a ring,
each partition key is mapped to a position on that ring, and whoever owns
that part of the ring is the primary for that data.  Then (oversimplified)
RF-1 adjacent nodes become replicas for that same data.

On Fri, Mar 27, 2015 at 6:55 AM, Sibbald, Charles charles.sibb...@bskyb.com
 wrote:


 http://www.datastax.com/documentation/cassandra/2.0/cassandra/configuration/configCassandra_yaml_r.html?scroll=reference_ds_qfg_n1r_1k__num_tokens

  So go with the default of 256, and leave initial_token empty:

  num_tokens: 256

 # initial_token:


  Cassandra will always give each node the same number of tokens; the only
 time you might want to vary this is if your instances are of
 different sizes/capabilities, which is itself a bad scenario.

   From: Björn Hachmann bjoern.hachm...@metrigo.de
 Reply-To: user@cassandra.apache.org user@cassandra.apache.org
 Date: Friday, 27 March 2015 12:11
 To: user user@cassandra.apache.org
 Subject: Re: Replication to second data center with different number of
 nodes


 2015-03-27 11:58 GMT+01:00 Sibbald, Charles charles.sibb...@bskyb.com:

 Cassandra’s Vnodes config


 Thank you. Yes, we are using vnodes! The num_tokens parameter controls the
 number of vnodes assigned to a specific node.

  Maybe I am seeing problems where there are none.

  Let me rephrase my question: How does Cassandra know it has to replicate
 1/3 of all keys to each single node in the second DC? I can see two ways:
  1. It has to be configured explicitly.
  2. It is derived from the number of nodes available in the data center at
 the time `nodetool rebuild` is started.

  Kind regards
 Björn



Re: Arbitrary nested tree hierarchy data model

2015-03-28 Thread Robert Wille
Ben Bromhead sent an email to me directly and expressed an interest in seeing 
some of my queries. I may as well post them for everyone. Here are my queries 
for the part of my code that reads and cleans up browse trees.

@NamedCqlQueries({
    @NamedCqlQuery(
        name = DocumentBrowseDaoImpl.Q_CHECK_TREE_EXISTS,
        query = "SELECT tree FROM tree WHERE tree = :tree",
        keyspace = KeyspaceFamilyImpl.BROWSE,
        consistencyLevel = ConsistencyLevel.LOCAL_QUORUM
    ),
    @NamedCqlQuery(
        name = DocumentBrowseDaoImpl.Q_GET_ALL_TREES,
        query = "SELECT tree, atime, pub, rhpath FROM tree",
        keyspace = KeyspaceFamilyImpl.BROWSE,
        consistencyLevel = ConsistencyLevel.LOCAL_QUORUM
    ),
    @NamedCqlQuery(
        name = DocumentBrowseDaoImpl.Q_GET_ALL_DOC_BROWSE_TREE,
        query = "SELECT tree FROM tree",
        keyspace = KeyspaceFamilyImpl.BROWSE,
        consistencyLevel = ConsistencyLevel.LOCAL_QUORUM
    ),
    @NamedCqlQuery(
        name = DocumentBrowseDaoImpl.Q_GET_ALL_DOC_BROWSE_NODE,
        query = "SELECT hpath, tree FROM node",
        keyspace = KeyspaceFamilyImpl.BROWSE,
        consistencyLevel = ConsistencyLevel.LOCAL_ONE
    ),
    @NamedCqlQuery(
        name = DocumentBrowseDaoImpl.Q_GET_ALL_DOC_BROWSE_INDEX_PAGE,
        query = "SELECT page, tree FROM path_by_page",
        keyspace = KeyspaceFamilyImpl.BROWSE,
        consistencyLevel = ConsistencyLevel.LOCAL_ONE
    ),
    @NamedCqlQuery(
        name = DocumentBrowseDaoImpl.Q_GET_ALL_DOC_BROWSE_INDEX_PUB,
        query = "SELECT distinct tree, bucket FROM path_by_pub",
        keyspace = KeyspaceFamilyImpl.BROWSE,
        consistencyLevel = ConsistencyLevel.LOCAL_ONE
    ),
    @NamedCqlQuery(
        name = DocumentBrowseDaoImpl.Q_GET_ALL_DOC_BROWSE_INDEX_CHILD,
        query = "SELECT distinct phpath, bucket, tree FROM path_by_parent",
        keyspace = KeyspaceFamilyImpl.BROWSE,
        consistencyLevel = ConsistencyLevel.LOCAL_ONE
    ),
    @NamedCqlQuery(
        name = DocumentBrowseDaoImpl.Q_CLEAN_DOC_BROWSE_TREE,
        query = "DELETE FROM tree WHERE tree IN :tree",
        keyspace = KeyspaceFamilyImpl.BROWSE,
        consistencyLevel = ConsistencyLevel.LOCAL_ONE
    ),
    @NamedCqlQuery(
        name = DocumentBrowseDaoImpl.Q_CLEAN_DOC_BROWSE_NODE,
        query = "DELETE FROM node WHERE hpath IN :hpath AND tree = :tree",
        keyspace = KeyspaceFamilyImpl.BROWSE,
        consistencyLevel = ConsistencyLevel.LOCAL_ONE
    ),
    @NamedCqlQuery(
        name = DocumentBrowseDaoImpl.Q_CLEAN_DOC_BROWSE_INDEX_PAGE,
        query = "DELETE FROM path_by_page WHERE page IN :page AND tree = :tree",
        keyspace = KeyspaceFamilyImpl.BROWSE,
        consistencyLevel = ConsistencyLevel.LOCAL_ONE
    ),
    @NamedCqlQuery(
        name = DocumentBrowseDaoImpl.Q_CLEAN_DOC_BROWSE_INDEX_PUB,
        query = "DELETE FROM path_by_pub WHERE tree = :tree AND bucket IN :bucket",
        keyspace = KeyspaceFamilyImpl.BROWSE,
        consistencyLevel = ConsistencyLevel.LOCAL_ONE
    ),
    @NamedCqlQuery(
        name = DocumentBrowseDaoImpl.Q_CLEAN_DOC_BROWSE_INDEX_CHILD,
        query = "DELETE FROM path_by_parent WHERE phpath = :phpath AND bucket = :bucket AND tree IN :tree",
        keyspace = KeyspaceFamilyImpl.BROWSE,
        consistencyLevel = ConsistencyLevel.LOCAL_ONE
    ),
    @NamedCqlQuery(
        name = DocumentBrowseDaoImpl.Q_GET_MAX_ORDINAL,
        query = "SELECT pord FROM path_by_pub WHERE tree = :tree AND bucket = :bucket ORDER BY pord DESC",
        keyspace = KeyspaceFamilyImpl.BROWSE,
        consistencyLevel = ConsistencyLevel.LOCAL_QUORUM
    ),
    @NamedCqlQuery(
        name = DocumentBrowseDaoImpl.Q_GET_PAGE,
        query = "SELECT page, tree, ord, hpath FROM path_by_page WHERE page = :page AND tree = :tree",
        keyspace = KeyspaceFamilyImpl.BROWSE,
        consistencyLevel = ConsistencyLevel.LOCAL_ONE
    ),
    @NamedCqlQuery(
        name = DocumentBrowseDaoImpl.Q_GET_PAGE_ALL_TREES,
        query = "SELECT page, tree, ord, hpath FROM path_by_page WHERE page = :page",
        keyspace = KeyspaceFamilyImpl.BROWSE,
        consistencyLevel = ConsistencyLevel.LOCAL_ONE
    ),
    @NamedCqlQuery(
        name = DocumentBrowseDaoImpl.Q_GET_NODE,
        query = "SELECT tree, hpath, node, ccount FROM node WHERE hpath = :hpath AND tree = :tree",
        keyspace = KeyspaceFamilyImpl.BROWSE,
        consistencyLevel = ConsistencyLevel.LOCAL_ONE
    ),
    @NamedCqlQuery(
        name = DocumentBrowseDaoImpl.Q_GET_NODE_ALL_TREES,
        query = "SELECT tree, hpath, node, ccount FROM node WHERE hpath = :hpath",
        keyspace = KeyspaceFamilyImpl.BROWSE,
        consistencyLevel = ConsistencyLevel.LOCAL_ONE
    ),
    @NamedCqlQuery(
        name = DocumentBrowseDaoImpl.Q_GET_TREE_FOR_HASHPATH,
        query = "SELECT tree, node FROM node WHERE hpath = :hpath",
        keyspace = KeyspaceFamilyImpl.BROWSE,
        consistencyLevel = ConsistencyLevel.LOCAL_ONE
    ),
    @NamedCqlQuery(
        name = DocumentBrowseDaoImpl.Q_GET_CHILDREN,
        query = "SELECT hpath FROM path_by_parent WHERE phpath = :phpath AND bucket = :bucket AND tree = :tree AND ord = :ord",
        keyspace = KeyspaceFamilyImpl.BROWSE,
        consistencyLevel = ConsistencyLevel.LOCAL_ONE
    ),
    @NamedCqlQuery(
        name = DocumentBrowseDaoImpl.Q_GET_ALL_CHILDREN,
        query = "SELECT hpath FROM path_by_parent WHERE phpath = :phpath AND bucket = :bucket AND tree = :tree",
        keyspace = KeyspaceFamilyImpl.BROWSE,
        consistencyLevel = ConsistencyLevel.LOCAL_ONE
    ),
    @NamedCqlQuery(
        name = DocumentBrowseDaoImpl.Q_GET_NEIGHBORS_NEXT,
        query = "SELECT hpath FROM path_by_pub WHERE tree = :tree AND bucket = :bucket AND pord > :pord ORDER BY pord",
        keyspace = KeyspaceFamilyImpl.BROWSE,

Re: Arbitrary nested tree hierarchy data model

2015-03-28 Thread daemeon reiydelle
Fascinating. Both the mysql front end and this delightful inverted search
solution. Your creativity makes me wonder what other delights your query
solutions might expose!!

sent from my mobile
Daemeon C.M. Reiydelle
USA 415.501.0198
London +44.0.20.8144.9872
On Mar 27, 2015 7:08 PM, Robert Wille rwi...@fold3.com wrote:

  Okay, this is going to be a pretty long post, but I think it's an
 interesting data model, and hopefully someone will find it worth going
 through.

  First, I think it will be easier to understand the modeling choices I
 made if you see the end product. Go to
 http://www.fold3.com/browse.php#249|hzUkLqDmI. What you see looks like
 one big tree, but actually is a combination of trees spliced together.
 There is one tree in a relational database that forms what I call the
 top-level browse. The top-level browse is used to navigate through
 categories until you arrive at a publication. When you drill down into a
 publication, you are then viewing data stored in Cassandra. The link
 provided above points to the root of a publication (in this case, maps from
 the Civil War), so to the left is top-level browse coming from MySQL, and
 to the right is the Cassandra browse. Each publication has an independent
 tree in Cassandra, with all trees stored in the same set of tables (I do
 not dynamically create tables for each publication — I personally think
 that’s a bad practice). We currently have 458 publications, and
 collectively they have about half a billion nodes and consume about 400 GB
 (RF=3).

  My trees are immutable. When there are changes to a publication (e.g.
 adding new documents), it is very difficult to know what changes need to be
 made to the tree to edit it in-place. Also, it would be impossible to
 maintain navigational consistency while a tree is in process of being
 restructured. So, when a publication changes, I create a completely new
 tree. Once the new tree is built, I change a reference to point to the new
 tree. I have a process that nightly pages through the tables and deletes
 records that belong to obsolete trees. This process takes about five hours.
 If it were much longer than that, I would probably run it weekly. My
 database has quite a bit of churn, which is fairly unusual for a
 Cassandra-based application. Most nights build two or three trees,
 generally resulting in a few tens of millions of new records and a slightly
 fewer number of deletions. Size-tiered compaction is a bad choice for
 churn, so I use leveled compaction. Most publications are at most a few
 million nodes, and generally build in less than 20 minutes.
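
 The leveled-compaction switch mentioned above is a single schema statement. A
 minimal sketch with the Java driver, assuming the keyspace is named browse and
 using the node table from the queries shared in this thread:

 import com.datastax.driver.core.Cluster;
 import com.datastax.driver.core.Session;

 public class EnableLeveledCompaction {
     public static void main(String[] args) {
         Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
         Session session = cluster.connect("browse");   // keyspace name is a guess

         // Switch the node table to leveled compaction; existing SSTables are
         // gradually re-levelled by background compaction after the change.
         session.execute(
             "ALTER TABLE node WITH compaction = {'class': 'LeveledCompactionStrategy'}");

         cluster.close();
     }
 }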

  Since any modeling exercise requires knowing the queries, I should
 describe that before getting into the model. Here are the features I need
 to support. For browsing the tree, I need to be able to get the children of
 a node (paginated), the siblings of a node (also paginated), and the
 ancestors of a node. The leaves of each tree are images and form a
 filmstrip. You can use the filmstrip to navigate through all the images in
 a publication in the tree’s natural order. If you go to my browse page and
 keep drilling down, you’ll eventually get to an image. The filmstrip
 appears at the bottom of the image viewer.

  Before I discuss the schema, I should discuss a couple of other
 non-obvious things that are relevant to the data model. One very common
 operation is to retrieve a node and all of its ancestors in order to
 display a path. Denormalization would suggest that I store the data for
 each node, along with that of all of its ancestors. That would mean that in
 my biggest tree, I would store the root node 180 million times. I didn’t
 consider that kind of bloat to be acceptable, so I do not denormalize
 ancestors. I also wanted to retrieve a node and its ancestors in constant
 time, rather than O(n) as would be typical for tree traversal. In order to
 accomplish this, I use a pretty unique idea for a node's primary key. I
 create a hash from information in the node, and then append it to the hash
 of its parent. So, the primary key is really a path. When I need to
 retrieve a node and its ancestors, I tokenize the path and issue queries in
 parallel to get all the nodes in the ancestry at the same time. In keeping
 with this pattern of not denormalizing, my auxiliary tables do not have
 node data in them, but instead provide a means of getting hash paths, which
 I then tokenize and make parallel requests with. Most requests that use an
 auxiliary table can generally just make a query to the auxiliary table to
 get the hash path, and then retrieve the node and its ancestors from the
 node table. Three or fewer trips to Cassandra are sufficient for all my
 API’s.

  Without further ado, here’s my schema (with commentary):

  CREATE TABLE tree (
 tree INT,
 pub INT,
 rhpath VARCHAR,
 atime TIMESTAMP,
 ccount INT,
 ncount INT,
 PRIMARY KEY (tree)
 ) WITH gc_grace_seconds = 864000;

  This table maintains the references to the root nodes 

Re: Arbitrary nested tree hierarchy data model

2015-03-28 Thread Peter Lin
that's neat, thanks for sharing.

sounds like the solution is partly inspired by Merkle trees, to make lookups
fast and easy.

peter

On Fri, Mar 27, 2015 at 10:07 PM, Robert Wille rwi...@fold3.com wrote:

  Okay, this is going to be a pretty long post, but I think it's an
 interesting data model, and hopefully someone will find it worth going
 through.

  First, I think it will be easier to understand the modeling choices I
 made if you see the end product. Go to
 http://www.fold3.com/browse.php#249|hzUkLqDmI. What you see looks like
 one big tree, but actually is a combination of trees spliced together.
 There is one tree in a relational database that forms what I call the
 top-level browse. The top-level browse is used to navigate through
 categories until you arrive at a publication. When you drill down into a
 publication, you are then viewing data stored in Cassandra. The link
 provided above points to the root of a publication (in this case, maps from
 the Civil War), so to the left is top-level browse coming from MySQL, and
 to the right is the Cassandra browse. Each publication has an independent
 tree in Cassandra, with all trees stored in the same set of tables (I do
 not dynamically create tables for each publication — I personally think
 that’s a bad practice). We currently have 458 publications, and
 collectively they have about half a billion nodes and consume about 400 GB
 (RF=3).

  My trees are immutable. When there are changes to a publication (e.g.
 adding new documents), it is very difficult to know what changes need to be
 made to the tree to edit it in-place. Also, it would be impossible to
 maintain navigational consistency while a tree is in process of being
 restructured. So, when a publication changes, I create a completely new
 tree. Once the new tree is built, I change a reference to point to the new
 tree. I have a process that nightly pages through the tables and deletes
 records that belong to obsolete trees. This process takes about five hours.
 If it were much longer than that, I would probably run it weekly. My
 database has quite a bit of churn, which is fairly unusual for a
 Cassandra-based application. Most nights build two or three trees,
 generally resulting in a few tens of millions of new records and a slightly
 fewer number of deletions. Size-tiered compaction is a bad choice for
 churn, so I use leveled compaction. Most publications are at most a few
 million nodes, and generally build in less than 20 minutes.

  Since any modeling exercise requires knowing the queries, I should
 describe that before getting into the model. Here are the features I need
 to support. For browsing the tree, I need to be able to get the children of
 a node (paginated), the siblings of a node (also paginated), and the
 ancestors of a node. The leaves of each tree are images and form a
 filmstrip. You can use the filmstrip to navigate through all the images in
 a publication in the tree’s natural order. If you go to my browse page and
 keep drilling down, you’ll eventually get to an image. The filmstrip
 appears at the bottom of the image viewer.

  Before I discuss the schema, I should discuss a couple of other
 non-obvious things that are relevant to the data model. One very common
 operation is to retrieve a node and all of its ancestors in order to
 display a path. Denormalization would suggest that I store the data for
 each node, along with that of all of its ancestors. That would mean that in
 my biggest tree, I would store the root node 180 million times. I didn’t
 consider that kind of bloat to be acceptable, so I do not denormalize
 ancestors. I also wanted to retrieve a node and its ancestors in constant
 time, rather than O(n) as would be typical for tree traversal. In order to
 accomplish this, I use a pretty unique idea for a node's primary key. I
 create a hash from information in the node, and then append it to the hash
 of its parent. So, the primary key is really a path. When I need to
 retrieve a node and its ancestors, I tokenize the path and issue queries in
 parallel to get all the nodes in the ancestry at the same time. In keeping
 with this pattern of not denormalizing, my auxiliary tables do not have
 node data in them, but instead provide a means of getting hash paths, which
 I then tokenize and make parallel requests with. Most requests that use an
 auxiliary table can generally just make a query to the auxiliary table to
 get the hash path, and then retrieve the node and its ancestors from the
 node table. Three or fewer trips to Cassandra are sufficient for all my
 API’s.

  Without further ado, here’s my schema (with commentary):

  CREATE TABLE tree (
 tree INT,
 pub INT,
 rhpath VARCHAR,
 atime TIMESTAMP,
 ccount INT,
 ncount INT,
 PRIMARY KEY (tree)
 ) WITH gc_grace_seconds = 864000;

  This table maintains the references to the root nodes for each tree. pub
 is the primary key for the publication table in my relational database.
 There is usually just one 

Re: High latencies for simple queries

2015-03-28 Thread Artur Siekielski

On 03/28/2015 12:13 AM, Ben Bromhead wrote:

One other thing to keep in mind / check is that doing these tests
locally the cassandra driver will connect using the network stack,
whereas postgres supports local connections over a unix domain socket
(this is also enabled by default).

Unix domain sockets are significantly faster than tcp as you don't have
a network stack to traverse. I think any driver using libpq will attempt
to use the domain socket when connecting locally.


Good catch. I made sure that psycopg2 connects through a TCP socket and 
the numbers increased by about 20%, but it is still an order of 
magnitude faster than Cassandra.




But I'm going to hazard a guess something else is going on with the
Cassandra connection as I'm able to get 0.5ms queries locally and that's
even with trace turned on.


Using python-driver?