sstablekeys (in the tools directory?) can extract the actual keys from your
sstables. You have to run it on each node and then combine and de-dupe the
final results, but I have used this technique with a query generator to extract
data more efficiently.
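The combine-and-de-dupe step can be sketched roughly as follows. This is only an illustration, assuming the sstablekeys output from each node has already been saved to a text file with one key per line; the function name merge_keys and the file names in the usage section are hypothetical.

```python
from pathlib import Path
from typing import Iterable, List


def merge_keys(lines_per_node: Iterable[Iterable[str]]) -> List[str]:
    """Combine key listings from several nodes and drop duplicates.

    Each element of lines_per_node is an iterable of text lines, assumed
    to hold one key per line (hypothetical format for this sketch).
    """
    seen = set()
    for lines in lines_per_node:
        for line in lines:
            key = line.strip()
            if key:  # skip blank lines
                seen.add(key)
    return sorted(seen)


if __name__ == "__main__":
    # Hypothetical file names: one dump of sstablekeys output per node.
    paths = [Path("node1_keys.txt"), Path("node2_keys.txt")]
    per_node = [p.read_text().splitlines() for p in paths if p.exists()]
    for key in merge_keys(per_node):
        print(key)
```

Because of replication, the same key normally shows up on several nodes, which is why a set is used before sorting.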
Sean Durity
From: Chris Splinter
Sent: Friday, January 17, 2020 1:47 PM
To: adrien ruffie
Cc: user@cassandra.apache.org; Erick Ramirez
Subject: [EXTERNAL] Re: COPY command with where condition
Do you know your partition keys?
One option could be to enumerate that list of partition keys in separate
commands to make the individual operations less expensive for the cluster.
For example:
Say your partition key column is called id and the ids in your database are
[1,2,3]
You could do
./dsbulk unload --dsbulk.schema.keyspace 'dev_keyspace' -query "SELECT * FROM
probe_sensors WHERE id = 1 AND localisation_id = 208812" -url /home/dump
./dsbulk unload --dsbulk.schema.keyspace 'dev_keyspace' -query "SELECT * FROM
probe_sensors WHERE id = 2 AND localisation_id = 208812" -url /home/dump
./dsbulk unload --dsbulk.schema.keyspace 'dev_keyspace' -query "SELECT * FROM
probe_sensors WHERE id = 3 AND localisation_id = 208812" -url /home/dump
Does that option work for you?
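For a longer list of ids, the per-key commands above could be generated rather than typed by hand. The following is a minimal sketch, not a tested recipe: the function name build_unload_commands is hypothetical, the dsbulk options are the ones quoted in this message, and writing each key to its own subdirectory under /home/dump is an assumption made here so that the runs do not share one output directory.

```python
import shlex
from typing import Iterable, List


def build_unload_commands(
    ids: Iterable[int],
    localisation_id: int,
    keyspace: str = "dev_keyspace",
    table: str = "probe_sensors",
    out_root: str = "/home/dump",
) -> List[List[str]]:
    """Build one dsbulk unload argv list per partition key value."""
    commands = []
    for pk in ids:
        query = (
            f"SELECT * FROM {table} "
            f"WHERE id = {int(pk)} AND localisation_id = {int(localisation_id)}"
        )
        commands.append([
            "./dsbulk", "unload",
            "--dsbulk.schema.keyspace", keyspace,
            "-query", query,
            # Assumption: a separate output directory per key.
            "-url", f"{out_root}/id_{int(pk)}",
        ])
    return commands


if __name__ == "__main__":
    # Print the commands for review instead of executing them.
    for argv in build_unload_commands([1, 2, 3], 208812):
        print(shlex.join(argv))
```

The script only prints the commands, so they can be checked before running them one at a time against the cluster.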
On Fri, Jan 17, 2020 at 12:17 PM adrien ruffie <adriennolar...@hotmail.fr> wrote:
I don't really know yet for the production environment, but in the
development environment the table contains more than 10,000,000 rows.
But we only need a subset of this table, not all of it ...
From: Chris Splinter <chris.splinter...@gmail.com>
Sent: Friday, January 17, 2020 17:40
To: adrien ruffie <adriennolar...@hotmail.fr>
Cc: user@cassandra.apache.org; Erick Ramirez <flightc...@gmail.com>
Subject: Re: COPY command with where condition
What you are seeing there is a standard read timeout. How many rows do you
expect back from that query?
On Fri, Jan 17, 2020 at 9:50 AM adrien ruffie <adriennolar...@hotmail.fr> wrote:
Thank you very much,
so I ran the following request, for example:
./dsbulk unload --dsbulk.schema.keyspace 'dev_keyspace' -query "SELECT * FROM
probe_sensors WHERE localisation_id = 208812 ALLOW FILTERING" -url /home/dump
But I get the following error:
com.datastax.dsbulk.executor.api.exception.BulkExecutionException: Statement
execution failed: SELECT * FROM crt_sensors WHERE site_id = 208812 ALLOW
FILTERING (Cassandra timeout during read query at consistency LOCAL_ONE (1
responses were required but only 0 replica responded))
I configured my driver with the following driver.conf, but nothing works
correctly. Do you know what the problem is?
datastax-java-driver {
  basic {
    contact-points = ["data1com:9042","data2.com:9042"]
    request {
      timeout = "200"
      consistency = "LOCAL_ONE"
    }
  }
  advanced {
    auth-provider {
      class = PlainTextAuthProvider
      username = "superuser"
      password = "mypass"
    }
  }
}
From: Chris Splinter <chris.splinter...@gmail.com>
Sent: Friday, January 17, 2020 16:17
To: user@cassandra.apache.org
Cc: Erick Ramirez <flightc...@gmail.com>
Subject: Re: COPY command with where condition
DSBulk has an option that lets you specify the query (including a WHERE clause).
See Example 19 in this blog post for details:
https://www.datastax.com/blog/2019/06/datastax-bulk-loader-unloading
On Fri, Jan 17, 2020 at 7:34 AM Jean Tremblay <jean.tremb...@zen-innovations.com> wrote:
Did you think about using a Materialised View to generate what you want to
keep, and then use DSBulk to extract the data?
On 17 Jan 2020, at 14:30, adrien ruffie <adriennolar...@hotmail.fr> wrote:
Sorry, I am coming back with a quick question about the bulk loader ...
https://www.datastax.com/blog/2018/05/introducing-datastax-bulk-loader
I read this: "Operations such as converting strings to lowercase, arithmetic
on input columns, or filtering out row