Re: Repeated messages about Removing tokens

2020-08-26 Thread Hossein Ghiyasi Mehr
Is it possible that you are removing/decommissioning/adding nodes at the same
time? You can't add two nodes at the same time.
Are you decommissioning a node while you are adding another one?
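Before any topology change, it's worth confirming that no other operation is still
streaming. A quick sketch, run from any node:

  nodetool status    # every node should be UN; nothing Leaving/Joining/Moving
  nodetool netstats  # "Not sending any streams" means no bootstrap/decommission is in flight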

*---*
*VafaTech  : A Total Solution for Data Gathering &
Analysis*
*---*


On Mon, Aug 24, 2020 at 2:37 AM Paulo Motta 
wrote:

> These messages should go away when the decommission/removenode is
> complete. Are you seeing them repeating for the same nodes after they've
> left or do they eventually stop?
>
> If not this is expected behavior but perhaps a bit too verbose if the
> message is being printed more than once per node removal (in which case you
> should probably open a JIRA to fix it).
>
> Em qua., 19 de ago. de 2020 às 02:30, Jai Bheemsen Rao Dhanwada <
> jaibheem...@gmail.com> escreveu:
>
>> Yes, correct, I have seen these messages, but in one of my clusters these
>> messages are repeated ~3000 times per day.
>>
>> On Tuesday, August 18, 2020, Erick Ramirez 
>> wrote:
>>
>>> Yes, it is normal to see that message when you are decommissioning
>>> nodes. It just means that the token ownership(s) is getting updated. Cheers!
>>>



Re: To find top 10 tables with top 10 sstables per read and top 10 tables with top tombstones per read ?

2020-03-16 Thread Hossein Ghiyasi Mehr
I think it's better to rethink your solution for this purpose. Tables,
partitions, tombstones, statistics, SSTables, queries and monitoring in
Cassandra are all different things! You need to understand NoSQL concepts
before doing anything.
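That said, a rough ranking is possible from nodetool output alone. A sketch (field
labels may differ slightly between Cassandra versions):

  # top 10 tables by average tombstones per slice (a rough proxy for tombstones per read)
  nodetool tablestats | awk '
      /Keyspace/                     { ks = $NF }
      /Table:/                       { tbl = $NF }
      /Average tombstones per slice/ { print $NF, ks"."tbl }
  ' | sort -rn | head -10

For SSTables touched per read, "nodetool tablehistograms <keyspace> <table>" shows an
SSTables column, but it has to be run per table.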
Best regards,
*---*
*VafaTech  : A Total Solution for Data Gathering &
Analysis*
*---*


On Mon, Mar 16, 2020 at 1:02 PM Kiran mk  wrote:

> Hi All,
>
> Is there a way to find top 10 tables with top 10 sstables per read and
> top 10 tables with top tombstones per read in Cassandra?
>
> As In Opscenter everytime we have to select the tables to find whats
> the tombstones per read.  There are chances that we might miss
> considering the tables which has more tombstones per read.
>
> Can you please suggest
>
> --
> Best Regards,
> Kiran.M.K.
>
> -
> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: user-h...@cassandra.apache.org
>
>


Re: How to find which table partitions having the more reads per sstables ?

2020-03-16 Thread Hossein Ghiyasi Mehr
You can get the read count per table (total and TPS) from JMX.
If you want to find hot partitions, you can use nodetool toppartitions,
without paying any money!
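For example (Cassandra 3.x syntax; the keyspace and table names below are placeholders):

  # sample my_ks.my_table for 10 seconds and print the hottest partitions
  nodetool toppartitions my_ks my_table 10000

  # per-table read counts are also exposed through nodetool, not only JMX:
  nodetool tablestats my_ks.my_table | grep "Local read count"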
*---*
*VafaTech  : A Total Solution for Data Gathering &
Analysis*
*---*


On Mon, Mar 16, 2020 at 11:41 AM Kiran mk  wrote:

> Hi All,
>
> I am trying to understand  reads per sstables.  How to find which
> table partitions having the more reads per sstables in Cassandra?
>
>
> --
> Best Regards,
> Kiran.M.K.
>
> -
> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: user-h...@cassandra.apache.org
>
>


Re: [EXTERNAL] Cassandra 3.11.X upgrades

2020-03-05 Thread Hossein Ghiyasi Mehr
There isn't any rollback in real-time systems.
It's better to test the binary upgrade and upgradesstables on one node first. Then:

   - if it was OK, upgrade the binary on all nodes, then run upgradesstables
   one server at a time.

OR

   - if it was OK, upgrade the servers (binary + sstables) one by one.
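A minimal per-node sequence might look like this (a sketch assuming a package install
managed by systemd; adjust paths and service names to your environment):

  nodetool drain                 # flush memtables and stop accepting new writes
  sudo systemctl stop cassandra
  # install the new Cassandra package/binaries here
  sudo systemctl start cassandra
  nodetool statusbinary          # should report "running" before moving to the next node
  nodetool upgradesstables       # only needed when the sstable format has changed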

*---*
*VafaTech <http://www.vafatech.com> : A Total Solution for Data Gathering &
Analysis*
*---*


On Wed, Mar 4, 2020 at 7:38 AM manish khandelwal <
manishkhandelwa...@gmail.com> wrote:

> Should upgradesstables not be run after every node is upgraded? If we need
> to rollback then  we will not be able to downgrade sstables to older
> version.
>
> Regards
> Manish
>
> On Tue, Mar 3, 2020 at 11:26 PM Hossein Ghiyasi Mehr <
> ghiyasim...@gmail.com> wrote:
>
>> It's safer to upgrade one node before upgrading another, to avoid
>> downtime.
>> After upgrading the binary and package, run upgradesstables on the candidate node,
>> then do it on all cluster nodes one by one.
>> *---*
>> *VafaTech <http://www.vafatech.com> : A Total Solution for Data Gathering
>> & Analysis*
>> *---*
>>
>>
>> On Thu, Feb 13, 2020 at 9:27 PM Sergio  wrote:
>>
>>>
>>>- Verify that nodetool upgradesstables has completed successfully on
>>>all nodes from any previous upgrade
>>>- Turn off repairs and any other streaming operations (add/remove
>>>nodes)
>>>- Nodetool drain on the node that needs to be stopped (seeds first,
>>>preferably)
>>>- Stop an un-upgraded node (seeds first, preferably)
>>>- Install new binaries and configs on the down node
>>>- Restart that node and make sure it comes up clean (it will
>>>function normally in the cluster – even with mixed versions)
>>>- nodetool statusbinary to verify if it is up and running
>>>- Repeat for all nodes
>>>- Once the binary upgrade has been performed in all the nodes: Run
>>>upgradesstables on each node (as many at a time as your load will allow).
>>>Minor upgrades usually don’t require this step (only if the sstable 
>>> format
>>>has changed), but it is good to check.
>>>- NOTE: in most cases applications can keep running and will not
>>>notice much impact – unless the cluster is overloaded and a single node
>>>down causes impact.
>>>
>>>
>>>
>>>I added 2 points to the list to clarify.
>>>
>>>Should we add this in a FAQ in the cassandra doc or in the awesome
>>>cassandra https://cassandra.link/awesome/
>>>
>>>Thanks,
>>>
>>>Sergio
>>>
>>>
>>> Il giorno mer 12 feb 2020 alle ore 10:58 Durity, Sean R <
>>> sean_r_dur...@homedepot.com> ha scritto:
>>>
>>>> Check the readme.txt for any upgrade notes, but the basic procedure is
>>>> to:
>>>>
>>>>- Verify that nodetool upgradesstables has completed successfully
>>>>on all nodes from any previous upgrade
>>>>- Turn off repairs and any other streaming operations (add/remove
>>>>nodes)
>>>>- Stop an un-upgraded node (seeds first, preferably)
>>>>- Install new binaries and configs on the down node
>>>>- Restart that node and make sure it comes up clean (it will
>>>>function normally in the cluster – even with mixed versions)
>>>>- Repeat for all nodes
>>>>- Run upgradesstables on each node (as many at a time as your load
>>>>will allow). Minor upgrades usually don’t require this step (only if the
>>>>sstable format has changed), but it is good to check.
>>>>- NOTE: in most cases applications can keep running and will not
>>>>notice much impact – unless the cluster is overloaded and a single node
>>>>down causes impact.
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> Sean Durity – Staff Systems Engineer, Cassandra
>>>>
>>>>
>>>>
>>>> *From:* Sergio 
>>>> *Sent:* Wednesday, February 12, 2020 11:36 AM
>>>> *To:* user@cassandra.apache.org
>>>> *Subject:* [EXTERNAL] Cassandra 3.11.X upgrades
>>>>
>>>>
>>>>
>>>> Hi 

Re: [EXTERNAL] Cassandra 3.11.X upgrades

2020-03-03 Thread Hossein Ghiyasi Mehr
It's safer to upgrade one node before upgrading another, to avoid
downtime.
After upgrading the binary and package, run upgradesstables on the candidate node,
then do it on all cluster nodes one by one.
*---*
*VafaTech  : A Total Solution for Data Gathering &
Analysis*
*---*


On Thu, Feb 13, 2020 at 9:27 PM Sergio  wrote:

>
>- Verify that nodetool upgradesstables has completed successfully on
>all nodes from any previous upgrade
>- Turn off repairs and any other streaming operations (add/remove
>nodes)
>- Nodetool drain on the node that needs to be stopped (seeds first,
>preferably)
>- Stop an un-upgraded node (seeds first, preferably)
>- Install new binaries and configs on the down node
>- Restart that node and make sure it comes up clean (it will function
>normally in the cluster – even with mixed versions)
>- nodetool statusbinary to verify if it is up and running
>- Repeat for all nodes
>- Once the binary upgrade has been performed in all the nodes: Run
>upgradesstables on each node (as many at a time as your load will allow).
>Minor upgrades usually don’t require this step (only if the sstable format
>has changed), but it is good to check.
>- NOTE: in most cases applications can keep running and will not
>notice much impact – unless the cluster is overloaded and a single node
>down causes impact.
>
>
>
>I added 2 points to the list to clarify.
>
>Should we add this in a FAQ in the cassandra doc or in the awesome
>cassandra https://cassandra.link/awesome/
>
>Thanks,
>
>Sergio
>
>
> Il giorno mer 12 feb 2020 alle ore 10:58 Durity, Sean R <
> sean_r_dur...@homedepot.com> ha scritto:
>
>> Check the readme.txt for any upgrade notes, but the basic procedure is to:
>>
>>- Verify that nodetool upgradesstables has completed successfully on
>>all nodes from any previous upgrade
>>- Turn off repairs and any other streaming operations (add/remove
>>nodes)
>>- Stop an un-upgraded node (seeds first, preferably)
>>- Install new binaries and configs on the down node
>>- Restart that node and make sure it comes up clean (it will function
>>normally in the cluster – even with mixed versions)
>>- Repeat for all nodes
>>- Run upgradesstables on each node (as many at a time as your load
>>will allow). Minor upgrades usually don’t require this step (only if the
>>sstable format has changed), but it is good to check.
>>- NOTE: in most cases applications can keep running and will not
>>notice much impact – unless the cluster is overloaded and a single node
>>down causes impact.
>>
>>
>>
>>
>>
>>
>>
>> Sean Durity – Staff Systems Engineer, Cassandra
>>
>>
>>
>> *From:* Sergio 
>> *Sent:* Wednesday, February 12, 2020 11:36 AM
>> *To:* user@cassandra.apache.org
>> *Subject:* [EXTERNAL] Cassandra 3.11.X upgrades
>>
>>
>>
>> Hi guys!
>>
>> How do you usually upgrade your cluster for minor version upgrades?
>>
>> I tried to add a node with 3.11.5 version to a test cluster with 3.11.4
>> nodes.
>>
>> Is there any restriction?
>>
>> Best,
>>
>> Sergio
>>
>> --
>>
>> The information in this Internet Email is confidential and may be legally
>> privileged. It is intended solely for the addressee. Access to this Email
>> by anyone else is unauthorized. If you are not the intended recipient, any
>> disclosure, copying, distribution or any action taken or omitted to be
>> taken in reliance on it, is prohibited and may be unlawful. When addressed
>> to our clients any opinions or advice contained in this Email are subject
>> to the terms and conditions expressed in any applicable governing The Home
>> Depot terms of business or client engagement letter. The Home Depot
>> disclaims all responsibility and liability for the accuracy and content of
>> this attachment and for any damages or losses arising from any
>> inaccuracies, errors, viruses, e.g., worms, trojan horses, etc., or other
>> items of a destructive nature, which may be contained in this attachment
>> and shall not be liable for direct, indirect, consequential or special
>> damages in connection with this e-mail message or its attachment.
>>
>


Re: Cassandra is not showing a node up hours after restart

2019-12-08 Thread Hossein Ghiyasi Mehr
Which version of Cassandra did you install, deb or tar?
If it's the deb package, its service script should be used to start/stop.
If it's the tarball, kill the Cassandra pid to stop it and use bin/cassandra to start it.

Stopping doesn't need any other actions (drain, disabling gossip, etc.).
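For a tarball install, something like this (a sketch; the install path is an assumption):

  # stop: send SIGTERM to the Cassandra JVM (its shutdown hook closes it cleanly)
  pkill -f CassandraDaemon

  # start: from the installation directory (add -f to stay in the foreground)
  cd /opt/cassandra && bin/cassandra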

Where do you use Cassandra?
*---*
*VafaTech  : A Total Solution for Data Gathering &
Analysis*
*---*


On Fri, Dec 6, 2019 at 11:20 PM Paul Mena  wrote:

> As we are still without a functional Cassandra cluster in our development
> environment, I thought I’d try restarting the same node (one of 4 in the
> cluster) with the following command:
>
>
>
> ip=$(cat /etc/hostname); nodetool disablethrift && nodetool disablebinary
> && sleep 5 && nodetool disablegossip && nodetool drain && sleep 10 && sudo
> service cassandra restart && until echo "SELECT * FROM system.peers LIMIT
> 1;" | cqlsh $ip > /dev/null 2>&1; do echo "Node $ip is still DOWN"; sleep
> 10; done && echo "Node $ip is now UP"
>
>
>
> The above command returned “Node is now UP” after about 40 seconds,
> confirmed on “node001” via “nodetool status”:
>
>
>
> user@node001=> nodetool status
>
> Datacenter: datacenter1
>
> ===
>
> Status=Up/Down
>
> |/ State=Normal/Leaving/Joining/Moving
>
> --  Address          Load       Tokens  Owns  Host ID                               Rack
>
> UN  192.168.187.121  539.43 GB  256     ?     c99cf581-f4ae-4aa9-ab37-1a114ab2429b  rack1
>
> UN  192.168.187.122  633.92 GB  256     ?     bfa07f47-7e37-42b4-9c0b-024b3c02e93f  rack1
>
> UN  192.168.187.123  576.31 GB  256     ?     273df9f3-e496-4c65-a1f2-325ed288a992  rack1
>
> UN  192.168.187.124  628.5 GB   256     ?     b8639cf1-5413-4ece-b882-2161bbb8a9c3  rack1
>
>
>
> As was the case before, running “nodetool status” on any of the other
> nodes shows that “node001” is still down:
>
>
>
> user@node002=> nodetool status
>
> Datacenter: datacenter1
>
> ===
>
> Status=Up/Down
>
> |/ State=Normal/Leaving/Joining/Moving
>
> --  Address          Load       Tokens  Owns  Host ID                               Rack
>
> DN  192.168.187.121  538.94 GB  256     ?     c99cf581-f4ae-4aa9-ab37-1a114ab2429b  rack1
>
> UN  192.168.187.122  634.04 GB  256     ?     bfa07f47-7e37-42b4-9c0b-024b3c02e93f  rack1
>
> UN  192.168.187.123  576.42 GB  256     ?     273df9f3-e496-4c65-a1f2-325ed288a992  rack1
>
> UN  192.168.187.124  628.56 GB  256     ?     b8639cf1-5413-4ece-b882-2161bbb8a9c3  rack1
>
>
>
> Is it inadvisable to continue with the rolling restart?
>
>
>
> *Paul Mena*
>
> Senior Application Administrator
>
> WHOI - Information Services
>
> 508-289-3539
>
>
>
> *From:* Shalom Sagges 
> *Sent:* Tuesday, November 26, 2019 12:59 AM
> *To:* user@cassandra.apache.org
> *Subject:* Re: Cassandra is not showing a node up hours after restart
>
>
>
> Hi Paul,
>
>
>
> From the gossipinfo output, it looks like the node's IP address and
> rpc_address are different.
>
> /192.168.*187*.121 vs RPC_ADDRESS:192.168.*185*.121
>
> You can also see that there's a schema disagreement between nodes, e.g.
> schema_id on node001 is fd2dcb4b-ca62-30df-b8f2-d3fd774f2801 and on node002
> it is fd2dcb4b-ca62-30df-b8f2-d3fd774f2801.
>
> You can run nodetool describecluster to see it as well.
>
> So I suggest to change the rpc_address to the ip_address of the node or
> set it to 0.0.0.0 and it should resolve the issue.
>
>
>
> Hope this helps!
>
>
>
>
>
> On Tue, Nov 26, 2019 at 4:05 AM Inquistive allen 
> wrote:
>
> Hello ,
>
>
>
> Check and compare everything parameters
>
>
>
> 1. Java version should ideally match across all nodes in the cluster
>
> 2. Check if port 7000 is open between the nodes. Use telnet or nc commands
>
> 3. You must see some clues in system logs, why the gossip is failing.
>
>
>
> Do confirm on the above things.
>
>
>
> Thanks
>
>
>
>
>
> On Tue, 26 Nov, 2019, 2:50 AM Paul Mena,  wrote:
>
> NTP was restarted on the Cassandra nodes, but unfortunately I’m still
> getting the same result: the restarted node does not appear to be rejoining
> the cluster.
>
>
>
> Here’s another data point: “nodetool gossipinfo”, when run from the
> restarted node (“node001”) shows a status of “normal”:
>
>
>
> user@node001=> nodetool -u gossipinfo
>
> /192.168.187.121
>
>   generation:1574364410
>
>   heartbeat:209150
>
>   NET_VERSION:8
>
>   RACK:rack1
>
>   STATUS:NORMAL,-104847506331695918
>
>   RELEASE_VERSION:2.1.9
>
>   SEVERITY:0.0
>
>   LOAD:5.78684155614E11
>
>   HOST_ID:c99cf581-f4ae-4aa9-ab37-1a114ab2429b
>
>   SCHEMA:fd2dcb4b-ca62-30df-b8f2-d3fd774f2801
>
>   DC:datacenter1
>
>   RPC_ADDRESS:192.168.185.121
>
>
>
> When run from one of the other nodes, however, node001’s status is shown
> as “shutdown”:
>
>
>
> user@node002=> nodetool gossipinfo
>
> /192.168.187.121
>
>   generation:1491825076
>
>   heartbeat:2147483647
>
>   STATUS:shutdown,true
>
>   RACK:rack1
>
>   NET_VERSION:8
>

Re: No progress in compactionstats

2019-12-05 Thread Hossein Ghiyasi Mehr
Hi,
Check nodetool tpstats output.
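For example (a quick sketch; look mainly at the compaction-related pools):

  nodetool tpstats            # check CompactionExecutor for pending/blocked tasks
  nodetool compactionstats -H # pending compactions and per-compaction progress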
*---*
*VafaTech  : A Total Solution for Data Gathering &
Analysis*
*---*


On Mon, Dec 2, 2019 at 10:56 AM Dipan Shah  wrote:

> Hello,
>
> I am running a 5 node Cassandra cluster on V 3.7 and did not understand
> why the following thing was happening. I had altered the compaction
> strategy of a table from Size to Leveled and while running "nodetool
> compactionstats" found that the SSTables were stuck and not getting
> compacted. This was happening on majority of the nodes while the remaining
> were still showing some progress at compacting the SSTables.
>
>
>
> There were no errors in system.log and a service restart also did not
> help. I reverted the compaction strategy to Size to see what happens and
> that sent the value of pending tasks back to 0.
>
> I have done this earlier for a similar tables and it has worked perfectly
> fine for me. What could have gone wrong over here?
>
> Thanks,
>
> Dipan Shah
>


Re: "Maximum memory usage reached (512.000MiB), cannot allocate chunk of 1.000MiB"

2019-12-04 Thread Hossein Ghiyasi Mehr
"3. Though Datastax do not recommended and recommends Horizontal scale, so
based on your requirement alternate old fashion option is to add swap
space."
Hi Shishir,
swap isn't recommended by DataStax!

*---*
*VafaTech.com - A Total Solution for Data Gathering & Analysis*
*---*


On Tue, Dec 3, 2019 at 5:53 PM Shishir Kumar 
wrote:

> Options: Assuming model and configurations are good and Data size per node
> less than 1 TB (though no such Benchmark).
>
> 1. Infra scale for memory
> 2. Try to change disk_access_mode to mmap_index_only.
> In this case you should not have any in memory DB tables.
> 3. Though Datastax do not recommended and recommends Horizontal scale, so
> based on your requirement alternate old fashion option is to add swap space.
>
> -Shishir
>
> On Tue, 3 Dec 2019, 15:52 John Belliveau, 
> wrote:
>
>> Reid,
>>
>> I've only been working with Cassandra for 2 years, and this echoes my
>> experience as well.
>>
>> Regarding the cache use, I know every use case is different, but have you
>> experimented and found any performance benefit to increasing its size?
>>
>> Thanks,
>> John Belliveau
>>
>>
>> On Mon, Dec 2, 2019, 11:07 AM Reid Pinchback 
>> wrote:
>>
>>> Rahul, if my memory of this is correct, that particular logging message
>>> is noisy, the cache is pretty much always used to its limit (and why not,
>>> it’s a cache, no point in using less than you have).
>>>
>>>
>>>
>>> No matter what value you set, you’ll just change the “reached (….)” part
>>> of it.  I think what would help you more is to work with the team(s) that
>>> have apps depending upon C* and decide what your performance SLA is with
>>> them.  If you are meeting your SLA, you don’t care about noisy messages.
>>> If you aren’t meeting your SLA, then the noisy messages become sources of
>>> ideas to look at.
>>>
>>>
>>>
>>> One thing you’ll find out pretty quickly.  There are a lot of knobs you
>>> can turn with C*, too many to allow for easy answers on what you should
>>> do.  Figure out what your throughput and latency SLAs are, and you’ll know
>>> when to stop tuning.  Otherwise you’ll discover that it’s a rabbit hole you
>>> can dive into and not come out of for weeks.
>>>
>>>
>>>
>>>
>>>
>>> *From: *Hossein Ghiyasi Mehr 
>>> *Reply-To: *"user@cassandra.apache.org" 
>>> *Date: *Monday, December 2, 2019 at 10:35 AM
>>> *To: *"user@cassandra.apache.org" 
>>> *Subject: *Re: "Maximum memory usage reached (512.000MiB), cannot
>>> allocate chunk of 1.000MiB"
>>>
>>>
>>>
>>> *Message from External Sender*
>>>
>>> It may be helpful:
>>> https://thelastpickle.com/blog/2018/08/08/compression_performance.html
>>>
>>> It's complex. Simple explanation, cassandra keeps sstables in memory
>>> based on chunk size and sstable parts. It manage loading new sstables to
>>> memory based on requests on different sstables correctly . You should be
>>> worry about it (sstables loaded in memory)
>>>
>>>
>>> *VafaTech.com - A Total Solution for Data Gathering & Analysis*
>>>
>>>
>>>
>>>
>>>
>>> On Mon, Dec 2, 2019 at 6:18 PM Rahul Reddy 
>>> wrote:
>>>
>>> Thanks Hossein,
>>>
>>>
>>>
>>> How does the chunks are moved out of memory (LRU?) if it want to make
>>> room for new requests to get chunks?if it has mechanism to clear chunks
>>> from cache what causes to cannot allocate chunk? Can you point me to any
>>> documention?
>>>
>>>
>>>
>>> On Sun, Dec 1, 2019, 12:03 PM Hossein Ghiyasi Mehr <
>>> ghiyasim...@gmail.com> wrote:
>>>
>>> Chunks are part of sstables. When there is enough space in memory to
>>> cache them, read performance will increase if application requests it
>>> again.
>>>
>>>
>>>
>>> Your real answer is application dependent. For example write heavy
>>> applications are different than read heavy or read-write heavy. Real time
>>> applications are different than time series data environments and ... .
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> On Sun, Dec 1, 2019 at 7:09 PM Rahul Reddy 
>>> wrote:
>>>
>>> Hello,
>>>
>>>
>>>
>>> We are seeing memory usage reached 512 mb and cannot allocate 1MB.  I
>>> see this because file_cache_size_mb by default set to 512MB.
>>>
>>>
>>>
>>> Datastax document recommends to increase the file_cache_size.
>>>
>>>
>>>
>>> We have 32G over all memory allocated 16G to Cassandra. What is the
>>> recommended value in my case. And also when does this memory gets filled up
>>> frequent does nodeflush helps in avoiding this info messages?
>>>
>>>


Re: Optimal backup strategy

2019-12-03 Thread Hossein Ghiyasi Mehr
I am sorry! That's right, I forgot the "*not*"!
1. It's *not* recommended to rely on the commit log after a node failure.
Cassandra has many options, such as the replication factor, as a
substitute solution.

*VafaTech.com - A Total Solution for Data Gathering & Analysis*


On Tue, Dec 3, 2019 at 10:42 AM Adarsh Kumar  wrote:

> Thanks Hossein,
>
> Just one more question is there any special SOP or consideration we have
> to take for multi-site backup.
>
> Please share any helpful link, blog or steps documented.
>
> Regards,
> Adarsh Kumar
>
> On Sun, Dec 1, 2019 at 10:40 PM Hossein Ghiyasi Mehr <
> ghiyasim...@gmail.com> wrote:
>
>> 1. It's recommended to use commit log after one node failure. Cassandra
>> has many options such as replication factor as substitute solution.
>> 2. Yes, right.
>>
>> *VafaTech.com - A Total Solution for Data Gathering & Analysis*
>>
>>
>> On Fri, Nov 29, 2019 at 9:33 AM Adarsh Kumar 
>> wrote:
>>
>>> Thanks Ahu and Hussein,
>>>
>>> So my understanding is:
>>>
>>>1. Commit log backup is not documented for Apache Cassandra, hence
>>>not standard. But can be used for restore on the same machine (For taking
>>>backup from commit_log_dir). If used on other machine(s) has to be in the
>>>same topology. Can it be used for replacement node?
>>>2. For periodic backup Snapshot+Incremental backup is the best option
>>>
>>>
>>> Thanks,
>>> Adarsh Kumar
>>>
>>> On Fri, Nov 29, 2019 at 7:28 AM guo Maxwell 
>>> wrote:
>>>
>>>> Hossein is right , But for use , we restore to the same cassandra
>>>> topology ,So it is usable to do replay .But when restore to the
>>>> same machine it is also usable .
>>>> Using sstableloader cost too much time and more storage(though will
>>>> reduce after  restored)
>>>>
>>>> Hossein Ghiyasi Mehr  于2019年11月28日周四 下午7:40写道:
>>>>
>>>>> commitlog backup isn't usable in another machine.
>>>>> Backup solution depends on what you want to do: periodic backup or
>>>>> backup to restore on other machine?
>>>>> Periodic backup is combine of snapshot and incremental backup. Remove
>>>>> incremental backup after new snapshot.
>>>>> Take backup to restore on other machine: You can use snapshot after
>>>>> flushing memtable or Use sstableloader.
>>>>>
>>>>>
>>>>> 
>>>>> VafaTech.com - A Total Solution for Data Gathering & Analysis
>>>>>
>>>>> On Thu, Nov 28, 2019 at 6:05 AM guo Maxwell 
>>>>> wrote:
>>>>>
>>>>>> for cassandra or datastax's documentation, commitlog's backup is not
>>>>>> mentioned.
>>>>>> only snapshot and incremental backup is described to do backup .
>>>>>>
>>>>>> Though commitlog's archive for keyspace/table is not support but
>>>>>> commitlog' replay (though you must put log to commitlog_dir and restart 
>>>>>> the
>>>>>> process)
>>>>>> support the feature of keyspace/table' replay filter (using
>>>>>> -Dcassandra.replayList with the keyspace1.table1,keyspace1.table2 format 
>>>>>> to
>>>>>> replay the specified keyspace/table)
>>>>>>
>>>>>> Snapshot do affect the storage, for us we got snapshot one week a
>>>>>> time under the low business peak and making snapshot got throttle ,for 
>>>>>> you
>>>>>> you may
>>>>>> see the issue (https://issues.apache.org/jira/browse/CASSANDRA-13019)
>>>>>>
>>>>>>
>>>>>>
>>>>>> Adarsh Kumar  于2019年11月28日周四 上午1:00写道:
>>>>>>
>>>>>>> Thanks Guo and Eric for replying,
>>>>>>>
>>>>>>> I have some confusions about commit log backup:
>>>>>>>
>>>>>>>1. commit log archival technique is (
>>>>>>>
>>>>>>> https://support.datastax.com/hc/en-us/articles/115001593706-Manual-Backup-and-Restore-with-Point-in-time-and-table-level-restore-
>>>>>>>) as good as an incremental backup, as it also captures commit logs 
>>>>>>> after
>>>>>>>memtable flush.
>>>>>>>2. If we go for "Snapshot +

Re: "Maximum memory usage reached (512.000MiB), cannot allocate chunk of 1.000MiB"

2019-12-02 Thread Hossein Ghiyasi Mehr
This may be helpful:
https://thelastpickle.com/blog/2018/08/08/compression_performance.html
It's complex. The simple explanation: Cassandra keeps SSTable chunks in memory based
on the chunk size and the SSTable parts. It manages loading new SSTables into memory,
based on the requests hitting different SSTables, correctly. That (which SSTables are
loaded in memory) is the part you should pay attention to.
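If you do decide to tune it, the knob is file_cache_size_in_mb in cassandra.yaml
(a sketch; the config path assumes a package install, and 1024 is only an example
value, not a recommendation):

  # chunk cache size; by default the smaller of 512 MB and 1/4 of the heap
  grep file_cache_size_in_mb /etc/cassandra/cassandra.yaml
  # e.g. set it explicitly:
  #   file_cache_size_in_mb: 1024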

*VafaTech.com - A Total Solution for Data Gathering & Analysis*


On Mon, Dec 2, 2019 at 6:18 PM Rahul Reddy  wrote:

> Thanks Hossein,
>
> How does the chunks are moved out of memory (LRU?) if it want to make room
> for new requests to get chunks?if it has mechanism to clear chunks from
> cache what causes to cannot allocate chunk? Can you point me to any
> documention?
>
> On Sun, Dec 1, 2019, 12:03 PM Hossein Ghiyasi Mehr 
> wrote:
>
>> Chunks are part of sstables. When there is enough space in memory to
>> cache them, read performance will increase if application requests it again.
>>
>> Your real answer is application dependent. For example write heavy
>> applications are different than read heavy or read-write heavy. Real time
>> applications are different than time series data environments and ... .
>>
>>
>>
>> On Sun, Dec 1, 2019 at 7:09 PM Rahul Reddy 
>> wrote:
>>
>>> Hello,
>>>
>>> We are seeing memory usage reached 512 mb and cannot allocate 1MB.  I
>>> see this because file_cache_size_mb by default set to 512MB.
>>>
>>> Datastax document recommends to increase the file_cache_size.
>>>
>>> We have 32G over all memory allocated 16G to Cassandra. What is the
>>> recommended value in my case. And also when does this memory gets filled up
>>> frequent does nodeflush helps in avoiding this info messages?
>>>
>>


Re: Optimal backup strategy

2019-12-01 Thread Hossein Ghiyasi Mehr
1. It's recommended to use commit log after one node failure. Cassandra has
many options such as replication factor as substitute solution.
2. Yes, right.

*VafaTech.com - A Total Solution for Data Gathering & Analysis*


On Fri, Nov 29, 2019 at 9:33 AM Adarsh Kumar  wrote:

> Thanks Ahu and Hussein,
>
> So my understanding is:
>
>1. Commit log backup is not documented for Apache Cassandra, hence not
>standard. But can be used for restore on the same machine (For taking
>backup from commit_log_dir). If used on other machine(s) has to be in the
>same topology. Can it be used for replacement node?
>2. For periodic backup Snapshot+Incremental backup is the best option
>
>
> Thanks,
> Adarsh Kumar
>
> On Fri, Nov 29, 2019 at 7:28 AM guo Maxwell  wrote:
>
>> Hossein is right , But for use , we restore to the same cassandra
>> topology ,So it is usable to do replay .But when restore to the
>> same machine it is also usable .
>> Using sstableloader cost too much time and more storage(though will
>> reduce after  restored)
>>
>> Hossein Ghiyasi Mehr  于2019年11月28日周四 下午7:40写道:
>>
>>> commitlog backup isn't usable in another machine.
>>> Backup solution depends on what you want to do: periodic backup or
>>> backup to restore on other machine?
>>> Periodic backup is combine of snapshot and incremental backup. Remove
>>> incremental backup after new snapshot.
>>> Take backup to restore on other machine: You can use snapshot after
>>> flushing memtable or Use sstableloader.
>>>
>>>
>>> 
>>> VafaTech.com - A Total Solution for Data Gathering & Analysis
>>>
>>> On Thu, Nov 28, 2019 at 6:05 AM guo Maxwell 
>>> wrote:
>>>
>>>> for cassandra or datastax's documentation, commitlog's backup is not
>>>> mentioned.
>>>> only snapshot and incremental backup is described to do backup .
>>>>
>>>> Though commitlog's archive for keyspace/table is not support but
>>>> commitlog' replay (though you must put log to commitlog_dir and restart the
>>>> process)
>>>> support the feature of keyspace/table' replay filter (using
>>>> -Dcassandra.replayList with the keyspace1.table1,keyspace1.table2 format to
>>>> replay the specified keyspace/table)
>>>>
>>>> Snapshot do affect the storage, for us we got snapshot one week a time
>>>> under the low business peak and making snapshot got throttle ,for you you
>>>> may
>>>> see the issue (https://issues.apache.org/jira/browse/CASSANDRA-13019)
>>>>
>>>>
>>>>
>>>> Adarsh Kumar  于2019年11月28日周四 上午1:00写道:
>>>>
>>>>> Thanks Guo and Eric for replying,
>>>>>
>>>>> I have some confusions about commit log backup:
>>>>>
>>>>>1. commit log archival technique is (
>>>>>
>>>>> https://support.datastax.com/hc/en-us/articles/115001593706-Manual-Backup-and-Restore-with-Point-in-time-and-table-level-restore-
>>>>>) as good as an incremental backup, as it also captures commit logs 
>>>>> after
>>>>>memtable flush.
>>>>>2. If we go for "Snapshot + Incremental bk + Commit log", here we
>>>>>have to take commit log from commit log directory (is there any SOP for
>>>>>this?). As commit logs are not per table or ks, we will have chalange 
>>>>> in
>>>>>restoring selective tables.
>>>>>3. Snapshot based backups are easy to manage and operate due to
>>>>>its simplicity. But they are heavy on storage. Any views on this?
>>>>>4. Please share any successful strategy that someone is using for
>>>>>production. We are still in the design phase and want to implement the 
>>>>> best
>>>>>solution.
>>>>>
>>>>> Thanks Eric for sharing link for medusa.
>>>>>
>>>>> Regards,
>>>>> Adarsh Kumar
>>>>>
>>>>> On Wed, Nov 27, 2019 at 5:16 PM guo Maxwell 
>>>>> wrote:
>>>>>
>>>>>> For me, I think the last one :
>>>>>>  Snapshot + Incremental + commitlog
>>>>>> is the most meaningful way to do backup and restore, when you make
>>>>>> the data backup to some where else like AWS S3.
>>>>>>
>>>>>>- Snapshot based backup // for increment

Re: "Maximum memory usage reached (512.000MiB), cannot allocate chunk of 1.000MiB"

2019-12-01 Thread Hossein Ghiyasi Mehr
Chunks are parts of SSTables. When there is enough space in memory to cache
them, read performance will increase if the application requests them again.

The real answer is application dependent. For example, write-heavy
applications are different from read-heavy or mixed read-write ones. Real-time
applications are different from time-series data environments, and so on.



On Sun, Dec 1, 2019 at 7:09 PM Rahul Reddy  wrote:

> Hello,
>
> We are seeing memory usage reached 512 mb and cannot allocate 1MB.  I see
> this because file_cache_size_mb by default set to 512MB.
>
> Datastax document recommends to increase the file_cache_size.
>
> We have 32G over all memory allocated 16G to Cassandra. What is the
> recommended value in my case. And also when does this memory gets filled up
> frequent does nodeflush helps in avoiding this info messages?
>


Re: Optimal backup strategy

2019-11-28 Thread Hossein Ghiyasi Mehr
A commitlog backup isn't usable on another machine.
The backup solution depends on what you want to do: a periodic backup, or a backup
to restore onto another machine?
A periodic backup is a combination of snapshots and incremental backups. Remove
the incremental backups after each new snapshot.
To take a backup to restore on another machine, you can use a snapshot (after
flushing the memtables) or use sstableloader.
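As a sketch of the second case (keyspace, table and host names below are placeholders):

  # on the source node: flush, then snapshot the keyspace
  nodetool flush my_ks
  nodetool snapshot -t my_backup my_ks

  # on the restore side: put the snapshot sstables under a path ending in
  # <keyspace>/<table>, then stream them into the target cluster
  sstableloader -d target_node_ip /tmp/restore/my_ks/my_table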



VafaTech.com - A Total Solution for Data Gathering & Analysis

On Thu, Nov 28, 2019 at 6:05 AM guo Maxwell  wrote:

> for cassandra or datastax's documentation, commitlog's backup is not
> mentioned.
> only snapshot and incremental backup is described to do backup .
>
> Though commitlog's archive for keyspace/table is not support but
> commitlog' replay (though you must put log to commitlog_dir and restart the
> process)
> support the feature of keyspace/table' replay filter (using
> -Dcassandra.replayList with the keyspace1.table1,keyspace1.table2 format to
> replay the specified keyspace/table)
>
> Snapshot do affect the storage, for us we got snapshot one week a time
> under the low business peak and making snapshot got throttle ,for you you
> may
> see the issue (https://issues.apache.org/jira/browse/CASSANDRA-13019)
>
>
>
> Adarsh Kumar  于2019年11月28日周四 上午1:00写道:
>
>> Thanks Guo and Eric for replying,
>>
>> I have some confusions about commit log backup:
>>
>>1. commit log archival technique is (
>>
>> https://support.datastax.com/hc/en-us/articles/115001593706-Manual-Backup-and-Restore-with-Point-in-time-and-table-level-restore-
>>) as good as an incremental backup, as it also captures commit logs after
>>memtable flush.
>>2. If we go for "Snapshot + Incremental bk + Commit log", here we
>>have to take commit log from commit log directory (is there any SOP for
>>this?). As commit logs are not per table or ks, we will have chalange in
>>restoring selective tables.
>>3. Snapshot based backups are easy to manage and operate due to its
>>simplicity. But they are heavy on storage. Any views on this?
>>4. Please share any successful strategy that someone is using for
>>production. We are still in the design phase and want to implement the 
>> best
>>solution.
>>
>> Thanks Eric for sharing link for medusa.
>>
>> Regards,
>> Adarsh Kumar
>>
>> On Wed, Nov 27, 2019 at 5:16 PM guo Maxwell  wrote:
>>
>>> For me, I think the last one :
>>>  Snapshot + Incremental + commitlog
>>> is the most meaningful way to do backup and restore, when you make the
>>> data backup to some where else like AWS S3.
>>>
>>>- Snapshot based backup // for incremental data will not be backuped
>>>and may lose data when restore to the time latter than snapshot time;
>>>- Incremental backups // better than snapshot backup .but
>>>with Insufficient data accuracy. For data remain in the memtable will be
>>>lose;
>>>- Snapshot + incremental
>>>- Snapshot + commitlog archival // better data precision than made
>>>incremental backup, but the data in the non archived commitlog(not 
>>> archive
>>>and commitlog log not closed) will not restore and will lose. Also when 
>>> log
>>>is too much, do log reply will cost very mucu time
>>>
>>> For me ,We use snapshot + incremental + commitlog archive. We read
>>> snapshot data and incremental data .Also the log is backuped .But we will
>>> not backup the
>>> log whose data have been flush to sstable ,for the data will be backuped
>>> by the way we do incremental backup .
>>>
>>> This way , the data will exist in the format of sstable trough snapshot
>>> backup and incremental backup . The log number will be very small .And log
>>> replay will not cost much time.
>>>
>>>
>>>
>>> Eric LELEU  于2019年11月27日周三 下午4:13写道:
>>>
 Hi,
 TheLastPickle & Spotify have released Medusa as Cassandra Backup tool.

 See :
 https://thelastpickle.com/blog/2019/11/05/cassandra-medusa-backup-tool-is-open-source.html

 Hope this link will help you.

 Eric


 Le 27/11/2019 à 08:10, Adarsh Kumar a écrit :

 Hi,

 I was looking for the backup strategies of Cassandra. After some study
 I came to know that there are the following options:

- Snapshot based backup
- Incremental backups
- Snapshot + incremental
- Snapshot + commitlog archival
- Snapshot + Incremental + commitlog

 Which is the most suitable and feasible approach? Also which of these
 is used most.
 Please let me know if there is any other option to tool available.

 Thanks in advance.

 Regards,
 Adarsh Kumar


>>>
>>> --
>>> you are the apple of my eye !
>>>
>>
>
> --
> you are the apple of my eye !
>


Re: Select statement in batch

2019-10-26 Thread Hossein Ghiyasi Mehr
Hello,
A batch isn't for SELECT queries; it's for transactional (write) queries. If you
want to read data, you should use a SELECT query (prepared, simple, etc.).
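For example, writes can be grouped in a logged batch while reads stay as plain SELECTs
(a sketch; the keyspace, table and columns are placeholders, assuming primary key
(email, companyid)):

  # writes: group related mutations
  cqlsh -e "BEGIN BATCH
              INSERT INTO my_ks.users (email, companyid, name) VALUES ('a@b.com', 1, 'A');
              UPDATE my_ks.users SET name = 'B' WHERE email = 'b@c.com' AND companyid = 1;
            APPLY BATCH;"

  # reads: a plain (ideally prepared) SELECT, never wrapped in a batch
  cqlsh -e "SELECT name FROM my_ks.users WHERE email = 'a@b.com' AND companyid = 1;"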

On Fri, Oct 11, 2019 at 5:50 PM Inquistive allen 
wrote:

> Hello Team,
>
> Wanted to understand the impacted of using a select statement inside a
> batch.
> I keep seeing some slow queries frequently in the logs.
>
> Please comment on what may the impact of the same. Is it the right
> practice. Will a select statement in batch be lead to increase in read
> latency than a normal select prepared statement.
>
> Thanks,
> Allen
>


Re: TWCS and gc_grace_seconds

2019-10-26 Thread Hossein Ghiyasi Mehr
gc_grace_seconds needs to be changed carefully, because it has a side effect on
hinted handoff (hints are TTL'd with the table's gc_grace_seconds, so lowering it
also shortens the window in which hints can still be delivered).
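A sketch of the change itself (keyspace/table names and the value are placeholders;
keep the value larger than your repair interval and hint window):

  # lower gc_grace_seconds on one table (the default is 864000 = 10 days)
  cqlsh -e "ALTER TABLE my_ks.my_table WITH gc_grace_seconds = 259200;"

  # compare against the hint window in cassandra.yaml (path assumes a package install)
  grep max_hint_window_in_ms /etc/cassandra/cassandra.yaml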

On Fri, Oct 18, 2019 at 5:04 PM Paul Chandler  wrote:

> Hi Adarsh,
>
> You will have problems if you manually delete data when using TWCS.
>
> To fully understand why, I recommend reading this The Last Pickle post:
> https://thelastpickle.com/blog/2016/12/08/TWCS-part1.html
> And this post I wrote that dives deeper into the problems with deletes:
> http://www.redshots.com/cassandra-twcs-must-have-ttls/
>
> Thanks
>
> Paul
>
> On 18 Oct 2019, at 14:22, Adarsh Kumar  wrote:
>
> Thanks Jeff,
>
>
> I just checked with business and we have differences in having TTL. So it
> will be manula purging always. We do not want to use LCS due to high IOs.
> So:
>
>1. As the use case is of time series data model, TWCS will be give
>some benefit (without TTL) and with frequent deleted data
>2. Are there any best practices/recommendations to handle high number
>of tombstones
>3. Can we handle this use case  with STCS also (with some
>configurations)
>
>
> Thanks in advance
>
> Adarsh Kumar
>
> On Fri, Oct 18, 2019 at 11:46 AM Jeff Jirsa  wrote:
>
>> Is everything in the table TTL’d?
>>
>> Do you do explicit deletes before the data is expected to expire ?
>>
>> Generally speaking, gcgs exists to prevent data resurrection. But ttl’d
>> data can’t be resurrected once it expires, so gcgs has no purpose unless
>> you’re deleting it before the ttl expires. If you’re doing that, twcs won’t
>> be able to drop whole sstables anyway, so maybe LCS will be less disk usage
>> (but much higher IO)
>>
>> On Oct 17, 2019, at 10:36 PM, Adarsh Kumar  wrote:
>>
>> 
>> Hi,
>>
>> We have a use case of time series data with TTL where we want to use
>> TimeWindowCompactionStrategy because of its better management for TTL and
>> tombstones. In this case, data we have is frequently deleted so we want to
>> reduce gc_grace_seconds to reduce the tombstones' life and reduce pressure
>> on storage. I have following questions:
>>
>>1. Do we always need to run repair for the table in reduced
>>gc_grace_seconds or there is any other way to manage repairs in this vase
>>2. Do we have any other strategy (or combination of strategies) to
>>manage frequently deleted time-series data
>>
>> Thanks in advance.
>>
>> Adarsh Kumar
>>
>>
>


Re: Repair Issues

2019-10-26 Thread Hossein Ghiyasi Mehr
If the problem still exists and all nodes are up, restart them one by one.
Then try to repair one node. After that, repair the other nodes one by one.
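For example, on one node at a time (a sketch using the keyspaces from your repair
output; -full forces a full, non-incremental repair on 3.x):

  nodetool repair -full platform_users
  nodetool repair -full platform_management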

On Fri, Oct 25, 2019 at 12:56 AM Ben Mills  wrote:

>
> Thanks Jon!
>
> This is very helpful - allow me to follow-up and ask a question.
>
> (1) Yes, incremental repairs will never be used (unless it becomes viable
> in Cassandra 4.x someday).
> (2) I hear you on the JVM - will look into that.
> (3) Been looking at Cassandra version 3.11.x though was unaware that 3.7
> is considered non-viable for production use.
>
> For (4) - Question/Request:
>
> Note that with:
>
> -XX:MaxRAMFraction=2
>
> the actual amount of memory allocated for heap space is effectively 2Gi
> (i.e. half of the 4Gi allocated on the machine type). We can definitely
> increase memory (for heap and nonheap), though can you expand a bit on your
> heap comment to help my understanding (as this is such a small cluster with
> such a small amount of data at rest)?
>
> Thanks again.
>
> On Thu, Oct 24, 2019 at 5:11 PM Jon Haddad  wrote:
>
>> There's some major warning signs for me with your environment.  4GB heap
>> is too low, and Cassandra 3.7 isn't something I would put into production.
>>
>> Your surface area for problems is massive right now.  Things I'd do:
>>
>> 1. Never use incremental repair.  Seems like you've already stopped doing
>> them, but it's worth mentioning.
>> 2. Upgrade to the latest JVM, that version's way out of date.
>> 3. Upgrade to Cassandra 3.11.latest (we're voting on 3.11.5 right now).
>> 4. Increase memory to 8GB minimum, preferably 12.
>>
>> I usually don't like making a bunch of changes without knowing the root
>> cause of a problem, but in your case there's so many potential problems I
>> don't think it's worth digging into, especially since the problem might be
>> one of the 500 or so bugs that were fixed since this release.
>>
>> Once you've done those things it'll be easier to narrow down the problem.
>>
>> Jon
>>
>>
>> On Thu, Oct 24, 2019 at 4:59 PM Ben Mills  wrote:
>>
>>> Hi Sergio,
>>>
>>> No, not at this time.
>>>
>>> It was in use with this cluster previously, and while there were no
>>> reaper-specific issues, it was removed to help simplify investigation of
>>> the underlying repair issues I've described.
>>>
>>> Thanks.
>>>
>>> On Thu, Oct 24, 2019 at 4:21 PM Sergio 
>>> wrote:
>>>
 Are you using Cassandra reaper?

 On Thu, Oct 24, 2019, 12:31 PM Ben Mills  wrote:

> Greetings,
>
> Inherited a small Cassandra cluster with some repair issues and need
> some advice on recommended next steps. Apologies in advance for a long
> email.
>
> Issue:
>
> Intermittent repair failures on two non-system keyspaces.
>
> - platform_users
> - platform_management
>
> Repair Type:
>
> Full, parallel repairs are run on each of the three nodes every five
> days.
>
> Repair command output for a typical failure:
>
> [2019-10-18 00:22:09,109] Starting repair command #46, repairing
> keyspace platform_users with repair options (parallelism: parallel, 
> primary
> range: false, incremental: false, job threads: 1, ColumnFamilies: [],
> dataCenters: [], hosts: [], # of ranges: 12)
> [2019-10-18 00:22:09,242] Repair session
> 5282be70-f13d-11e9-9b4e-7f6db768ba9a for range
> [(-1890954128429545684,2847510199483651721],
> (8249813014782655320,-8746483007209345011],
> (4299912178579297893,6811748355903297393],
> (-8746483007209345011,-8628999431140554276],
> (-5865769407232506956,-4746990901966533744],
> (-4470950459111056725,-1890954128429545684],
> (4001531392883953257,4299912178579297893],
> (6811748355903297393,6878104809564599690],
> (6878104809564599690,8249813014782655320],
> (-4746990901966533744,-4470950459111056725],
> (-8628999431140554276,-5865769407232506956],
> (2847510199483651721,4001531392883953257]] failed with error [repair
> #5282be70-f13d-11e9-9b4e-7f6db768ba9a on platform_users/access_tokens_v2,
> [(-1890954128429545684,2847510199483651721],
> (8249813014782655320,-8746483007209345011],
> (4299912178579297893,6811748355903297393],
> (-8746483007209345011,-8628999431140554276],
> (-5865769407232506956,-4746990901966533744],
> (-4470950459111056725,-1890954128429545684],
> (4001531392883953257,4299912178579297893],
> (6811748355903297393,6878104809564599690],
> (6878104809564599690,8249813014782655320],
> (-4746990901966533744,-4470950459111056725],
> (-8628999431140554276,-5865769407232506956],
> (2847510199483651721,4001531392883953257]]] Validation failed in /10.x.x.x
> (progress: 26%)
> [2019-10-18 00:22:09,246] Some repair failed
> [2019-10-18 00:22:09,248] Repair command #46 finished in 0 seconds
>
> Additional Notes:
>
> Repairs encounter above failures more often than not. Sometimes on one
> node only, 

Re: [EXTERNAL] Cassandra Export error in COPY command

2019-10-24 Thread Hossein Ghiyasi Mehr
I tested dsbulk too. But there are many errors:

"[1710949318] Error writing cancel request. This is not critical (the
request will eventually time out server-side)."
"Forcing termination of Connection[/127.0.0.1:9042-14, inFlight=1,
closed=true]. This should not happen and is likely a bug, please report."

I tried with "--executor.maxPerSecond 1000" and
"--driver.socket.readTimeout 3600" options but errors didn't resolve.

How can I export a table data?
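For reference, the shape of the dsbulk command I'm running is roughly this (keyspace,
table and output path replaced with placeholders):

  dsbulk unload -k my_ks -t my_table -url ./export_dir \
    --executor.maxPerSecond 1000 \
    --driver.socket.readTimeout 3600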

On Mon, Sep 23, 2019 at 6:30 AM Durity, Sean R 
wrote:

> Copy command tries to export all rows in the table, not just the ones on
> the node. It will eventually timeout if the table is large. It is really
> built for something under 5 million rows or so. Dsbulk (from DataStax) is
> great for this, if you are a customer. Otherwise, you will probably need to
> write an extract of some kind. You can get keys from the sstables, then
> dedupe, then export rows one by one using the keys (kind of painful). How
> large is the table you are trying to export?
>
>
>
> Sean Durity
>
>
>
> *From:* Hossein Ghiyasi Mehr 
> *Sent:* Saturday, September 21, 2019 8:02 AM
> *To:* user@cassandra.apache.org
> *Subject:* [EXTERNAL] Cassandra Export error in COPY command
>
>
>
> Hi all members,
>
> I want to export (pk, another_int_column) from single node using COPY
> command. But after about 1h 45m, I've got a lot of read errors:
>
>
>
>
>
> I tried this action many times but after maximum 2h, it failed with the
> errors.
>
>
>
> Any idea may help me!
>
> Thanks.
>
> --
>
> The information in this Internet Email is confidential and may be legally
> privileged. It is intended solely for the addressee. Access to this Email
> by anyone else is unauthorized. If you are not the intended recipient, any
> disclosure, copying, distribution or any action taken or omitted to be
> taken in reliance on it, is prohibited and may be unlawful. When addressed
> to our clients any opinions or advice contained in this Email are subject
> to the terms and conditions expressed in any applicable governing The Home
> Depot terms of business or client engagement letter. The Home Depot
> disclaims all responsibility and liability for the accuracy and content of
> this attachment and for any damages or losses arising from any
> inaccuracies, errors, viruses, e.g., worms, trojan horses, etc., or other
> items of a destructive nature, which may be contained in this attachment
> and shall not be liable for direct, indirect, consequential or special
> damages in connection with this e-mail message or its attachment.
>


Re: [EXTERNAL] Cassandra Export error in COPY command

2019-09-23 Thread Hossein Ghiyasi Mehr
The table has more than 10M rows. I used the COPY command on a cluster with
five machines for this table and everything was OK.
I took a backup to a single machine using sstableloader.
Now I want to extract the rows using the COPY command, but I can't!

On Mon, Sep 23, 2019 at 6:30 AM Durity, Sean R 
wrote:

> Copy command tries to export all rows in the table, not just the ones on
> the node. It will eventually timeout if the table is large. It is really
> built for something under 5 million rows or so. Dsbulk (from DataStax) is
> great for this, if you are a customer. Otherwise, you will probably need to
> write an extract of some kind. You can get keys from the sstables, then
> dedupe, then export rows one by one using the keys (kind of painful). How
> large is the table you are trying to export?
>
>
>
> Sean Durity
>
>
>
> *From:* Hossein Ghiyasi Mehr 
> *Sent:* Saturday, September 21, 2019 8:02 AM
> *To:* user@cassandra.apache.org
> *Subject:* [EXTERNAL] Cassandra Export error in COPY command
>
>
>
> Hi all members,
>
> I want to export (pk, another_int_column) from single node using COPY
> command. But after about 1h 45m, I've got a lot of read errors:
>
>
>
>
>
> I tried this action many times but after maximum 2h, it failed with the
> errors.
>
>
>
> Any idea may help me!
>
> Thanks.
>
> --
>
> The information in this Internet Email is confidential and may be legally
> privileged. It is intended solely for the addressee. Access to this Email
> by anyone else is unauthorized. If you are not the intended recipient, any
> disclosure, copying, distribution or any action taken or omitted to be
> taken in reliance on it, is prohibited and may be unlawful. When addressed
> to our clients any opinions or advice contained in this Email are subject
> to the terms and conditions expressed in any applicable governing The Home
> Depot terms of business or client engagement letter. The Home Depot
> disclaims all responsibility and liability for the accuracy and content of
> this attachment and for any damages or losses arising from any
> inaccuracies, errors, viruses, e.g., worms, trojan horses, etc., or other
> items of a destructive nature, which may be contained in this attachment
> and shall not be liable for direct, indirect, consequential or special
> damages in connection with this e-mail message or its attachment.
>


Cassandra Export error in COPY command

2019-09-21 Thread Hossein Ghiyasi Mehr
Hi all members,
I want to export (pk, another_int_column) from single node using COPY
command. But after about 1h 45m, I've got a lot of read errors:

[image: image.png]

I tried this action many times but after maximum 2h, it failed with the
errors.

Any idea may help me!
Thanks.


Re: Update/where statement Adds Row

2019-09-12 Thread Hossein Ghiyasi Mehr
UPDATE in Cassandra is an upsert (update or insert). So when you update a row
that doesn't exist, it will create it.
An "IF EXISTS" condition can be used in such queries to prevent that.

On Thu, Sep 12, 2019 at 8:35 AM A  wrote:

> I have an update statement that has a where clause with the primary key
> (email,companyid).
>
> When executed it always creates a new row. It’s like it’s not finding the
> existing row with the primary key.
>
> I’m using Cassandra-driver.
>
> What am I doing wrong? I don’t want a new row. Why doesn’t it seem to be
> using the where clause to identify the existing row?
>
> Thanks,
> Angel
>
>
>
> Sent from Yahoo Mail for iPhone
> 
>


Fwd: Cassandra Export error in COPY command

2019-09-11 Thread Hossein Ghiyasi Mehr
-- Forwarded message -
From: Hossein Ghiyasi Mehr 
Date: Sun, Sep 8, 2019 at 11:38 AM
Subject: Cassandra Export error in COPY command
To: 


Hi all members,
I want to export (pk, another_int_column) from single node using COPY
command. But after about 1h 45m, I've got a lot of read errors:

[image: image.png]

I tried this action many times but after maximum 2h, it failed with the
errors. cql timeout param is set correctly.

Any idea may help me!
Thanks.