Re: Call for Presentations closing TOMORROW: Community over Code EU 2024

2024-01-11 Thread Mick Semb Wever
The CFP for the Cassandra track at the Community Over Code EU conference, June in Bratislava, closes tomorrow (Friday)!! We'd love to hear your Cassandra experience, operating or coding. Submit before it's too late. See you there, Mick On Mon, 8 Jan 2024 at 20:24, Paulo Motta wrote: > I

[RELEASE] Apache Cassandra Java Driver 4.18.0 released

2023-12-12 Thread Mick Semb Wever
The Cassandra team is pleased to announce the release of Cassandra Java Driver version 4.18.0 The Source release and Binary convenience artifacts are available here: https://dist.apache.org/repos/dist/release/cassandra/cassandra-java-driver/4.18.0/ The Maven artifacts can be found at:

[RELEASE] Apache Cassandra 5.0-beta1 released

2023-12-05 Thread Mick Semb Wever
The Cassandra team is pleased to announce the release of Apache Cassandra version 5.0-beta1. Apache Cassandra is a fully distributed database. It is the right choice when you need scalability and high availability without compromising performance. Downloads of source and binary distributions are

Re: Cassandra Summit: Engage those networks!

2023-11-30 Thread Mick Semb Wever
Looking forward to seeing you all! Cassandra 5* has so many game changing features in it, I'm super excited. On Thu, 30 Nov 2023 at 07:55, Bhagdev, Meet wrote: > I’m going and hope to see you there  > > > > Cheers, > > Meet > > *From: *Paulo Motta > *Reply-To: *"user@cassandra.apache.org"

[RELEASE] Apache Cassandra 5.0-alpha2 released

2023-11-04 Thread Mick Semb Wever
The Cassandra team is pleased to announce the release of Apache Cassandra version 5.0-alpha2. This release contains Vector Similarity Search (CEP-30). http://cassandra.apache.org/ Downloads of source and binary distributions are listed in our download section:

[RELEASE] Apache Cassandra 5.0-alpha1 released

2023-09-08 Thread Mick Semb Wever
The Cassandra team is pleased to announce the release of Apache Cassandra version 5.0-alpha1. DISCLAIMER, this alpha release does not contain the expected 5.0 features: Vector Search (CEP-30), Transactional Cluster Metadata (CEP-21) and Accord Transactions (CEP-15). These features will land in a

[RELEASE] Apache Cassandra 4.1.2 released

2023-05-29 Thread Mick Semb Wever
The Cassandra team is pleased to announce the release of Apache Cassandra version 4.1.2. Apache Cassandra is a fully distributed database. It is the right choice when you need scalability and high availability without compromising performance. http://cassandra.apache.org/ Downloads of source

[RELEASE] Apache Cassandra 4.0.10 released

2023-05-29 Thread Mick Semb Wever
The Cassandra team is pleased to announce the release of Apache Cassandra version 4.0.10. Apache Cassandra is a fully distributed database. It is the right choice when you need scalability and high availability without compromising performance. http://cassandra.apache.org/ Downloads of source

Re: JIRA account creation request

2023-02-15 Thread Mick Semb Wever
> HI Mick, > > Could you pls. help with JIRA account for me as well ? > Done Srinivas. You should have received an email. Welcome to the Cassandra community.

Re: JIRA account creation request

2023-02-15 Thread Mick Semb Wever
> I would like to get my JIRA account created as I would like to contribute. > Here are my details > > email address : manishkhandelwa...@gmail.com > Your jira account has been created. You should have received an email. regards, Mick

Re: Upgrading Cassandra 3.11.14 → 4.1

2023-01-24 Thread Mick Semb Wever
On Mon, 16 Jan 2023 at 14:38, Lapo Luchini wrote: > is upgrading Cassandra 3.11.14 → 4.1 supported, > 3.11.14 → 4.1 is supported. It is recommended to go to the last patch version (i.e. 3.11.14) before the major upgrade. Make sure all sstables are upgraded to the current format

Re: [DISCUSS] Formation of Apache Cassandra Publicity & Marketing Group

2023-01-24 Thread Mick Semb Wever
The market...@cassandra.apache.org list is created. To subscribe send an email to marketing-subscr...@cassandra.apache.org from the email address you want to subscribe from. If you are a committer you can alternately use Whimsy: https://whimsy.apache.org/committers/subscribe regards, Mick On

[RELEASE] Apache Cassandra 4.1.0 GA released

2022-12-13 Thread Mick Semb Wever
The Cassandra team is pleased to announce the GA release of Apache Cassandra version 4.1.0. Apache Cassandra is a fully distributed database. It is the right choice when you need scalability and high availability without compromising performance. http://cassandra.apache.org/ Downloads of

[RELEASE] Apache Cassandra 4.1-rc1 released

2022-11-22 Thread Mick Semb Wever
The Cassandra team is pleased to announce the release of Apache Cassandra version 4.1-rc1. Apache Cassandra is a fully distributed database. It is the right choice when you need scalability and high availability without compromising performance. http://cassandra.apache.org/ Downloads of source

[RELEASE] Apache Cassandra 4.0.7 released

2022-10-23 Thread Mick Semb Wever
The Cassandra team is pleased to announce the release of Apache Cassandra version 4.0.7. Apache Cassandra is a fully distributed database. It is the right choice when you need scalability and high availability without compromising performance. http://cassandra.apache.org/ Downloads of source

[RELEASE] Apache Cassandra 3.11.14 released

2022-10-23 Thread Mick Semb Wever
The Cassandra team is pleased to announce the release of Apache Cassandra version 3.11.14. Apache Cassandra is a fully distributed database. It is the right choice when you need scalability and high availability without compromising performance. http://cassandra.apache.org/ Downloads of source

[RELEASE] Apache Cassandra 3.0.28 released

2022-10-23 Thread Mick Semb Wever
The Cassandra team is pleased to announce the release of Apache Cassandra version 3.0.28. Apache Cassandra is a fully distributed database. It is the right choice when you need scalability and high availability without compromising performance. http://cassandra.apache.org/ Downloads of source

Re: [RELEASE] Apache Cassandra 4.1-beta1 released

2022-10-12 Thread Mick Semb Wever
Correction… Downloads of source and binary distributions are listed in our download > section: > > http://cassandra.apache.org/download/ > The source and binary distributions are to be found here: https://downloads.apache.org/cassandra/4.1-beta1/ (4.1 won't appear on our downloads page until

[RELEASE] Apache Cassandra 4.1-beta1 released

2022-10-05 Thread Mick Semb Wever
The Cassandra team is pleased to announce the release of Apache Cassandra version 4.1-beta1. Apache Cassandra is a fully distributed database. It is the right choice when you need scalability and high availability without compromising performance. http://cassandra.apache.org/ Downloads of

[ANNOUNCE] Debian and RedHat package repositories are moving!

2022-08-26 Thread Mick Semb Wever
Your Debian `cassandra.sources.list` and RedHat `cassandra.repo` files must be updated to the new repository URLs. The Debian file is typically at `/etc/apt/sources.list.d/cassandra.sources.list`. The RedHat file is typically at `/etc/yum.repos.d/cassandra.repo`. For Debian the repository is now

[RELEASE] Apache Cassandra 4.0.6 released

2022-08-25 Thread Mick Semb Wever
The Cassandra team is pleased to announce the release of Apache Cassandra version 4.0.6. Apache Cassandra is a fully distributed database. It is the right choice when you need scalability and high availability without compromising performance. http://cassandra.apache.org/ Downloads of source

[RELEASE] Apache Cassandra 4.0.5 released

2022-07-18 Thread Mick Semb Wever
The Cassandra team is pleased to announce the release of Apache Cassandra version 4.0.5. Apache Cassandra is a fully distributed database. It is the right choice when you need scalability and high availability without compromising performance. http://cassandra.apache.org/ Downloads of source

Re: [RELEASE] Apache Cassandra 4.1-alpha1 released

2022-05-30 Thread Mick Semb Wever
> > Downloads of source and binary distributions are listed in our download > section: > > http://cassandra.apache.org/download/ > > This version is the first alpha release[1] on the 4.1 series. As always, > please pay attention to the release notes[2] and let us know[3] if you were > to

[RELEASE] Apache Cassandra 4.1-alpha1 released

2022-05-27 Thread Mick Semb Wever
The Cassandra team is pleased to announce the release of Apache Cassandra version 4.1-alpha1. Apache Cassandra is a fully distributed database. It is the right choice when you need scalability and high availability without compromising performance. http://cassandra.apache.org/ Downloads of

Last week to submit a talk to ApacheCon New Orleans and the Cassandra track

2022-05-17 Thread Mick Semb Wever
ApacheCon North America will be held October 3-6, at the Sheraton Hotel in New Orleans. The CFP closes this weekend! https://www.apachecon.com/acna2022/cfp.html It will be fantastic to catch up with as many of you as possible. Even better will be the talks you share with us, but you gotta

[RELEASE] Apache Cassandra 4.0.4 released

2022-05-13 Thread Mick Semb Wever
The Cassandra team is pleased to announce the release of Apache Cassandra version 4.0.4. Apache Cassandra is a fully distributed database. It is the right choice when you need scalability and high availability without compromising performance. http://cassandra.apache.org/ Downloads of source

[RELEASE] Apache Cassandra 3.11.13 released

2022-05-13 Thread Mick Semb Wever
The Cassandra team is pleased to announce the release of Apache Cassandra version 3.11.13. Apache Cassandra is a fully distributed database. It is the right choice when you need scalability and high availability without compromising performance. http://cassandra.apache.org/ Downloads of source

[RELEASE] Apache Cassandra 3.0.27 released

2022-05-13 Thread Mick Semb Wever
The Cassandra team is pleased to announce the release of Apache Cassandra version 3.0.27. Apache Cassandra is a fully distributed database. It is the right choice when you need scalability and high availability without compromising performance. http://cassandra.apache.org/ Downloads of source

Applications for Travel Assistance to ApacheCon NA 2022 now open

2022-04-20 Thread Mick Semb Wever
(On behalf of the TAC) The ASF Travel Assistance Committee (TAC) is pleased to announce that travel assistance applications for ApacheCon NA 2022 are now open! We will be supporting ApacheCon North America in New Orleans, Louisiana, on October 3rd through 6th, 2022. TAC exists to help those

Cassandra track Call for Papers. ApacheCon NA October 3-6, 2022

2022-04-12 Thread Mick Semb Wever
We are excited to announce that the upcoming ApacheCon North America will have a two day Cassandra track. ApacheCon North America will be held October 3-6, at the Sheraton Hotel in New Orleans. The CFP is now open, and will be until May 23rd. We are interested in all talks with anything related

[RELEASE] Apache Cassandra 4.0.3 released

2022-02-16 Thread Mick Semb Wever
The Cassandra team is pleased to announce the release of Apache Cassandra version 4.0.3. Apache Cassandra is a fully distributed database. It is the right choice when you need scalability and high availability without compromising performance. http://cassandra.apache.org/ Downloads of source

[RELEASE] Apache Cassandra 4.0.2 released

2022-02-11 Thread Mick Semb Wever
The Cassandra team is pleased to announce the release of Apache Cassandra version 4.0.2. Apache Cassandra is a fully distributed database. It is the right choice when you need scalability and high availability without compromising performance. http://cassandra.apache.org/ Downloads of source

[RELEASE] Apache Cassandra 3.11.12 released

2022-02-11 Thread Mick Semb Wever
The Cassandra team is pleased to announce the release of Apache Cassandra version 3.11.12. Apache Cassandra is a fully distributed database. It is the right choice when you need scalability and high availability without compromising performance. http://cassandra.apache.org/ Downloads of source

[RELEASE] Apache Cassandra 3.0.26 released

2022-02-11 Thread Mick Semb Wever
The Cassandra team is pleased to announce the release of Apache Cassandra version 3.0.26. Apache Cassandra is a fully distributed database. It is the right choice when you need scalability and high availability without compromising performance. http://cassandra.apache.org/ Downloads of source

Re: 4.1 Release Date

2022-01-31 Thread Mick Semb Wever
> > Apache Cassandra 3.0 > Released on 2021-02-01, and supported until 4.1 release > (April 2022). > Would the wording "… and supported until 4.1.0 release (May-June 2022)." be enough? (it would be nice to keep the text brief on this page) If you would like to… this is

[RELEASE] Apache Cassandra 4.0-rc2 released

2021-06-30 Thread Mick Semb Wever
The Cassandra team is pleased to announce the release of Apache Cassandra version 4.0-rc2. Apache Cassandra is a fully distributed database. It is the right choice when you need scalability and high availability without compromising performance. http://cassandra.apache.org/ Downloads of source

[RELEASE] Apache Cassandra 4.0-rc1 released

2021-04-25 Thread Mick Semb Wever
The Cassandra team is pleased to announce the release of Apache Cassandra version 4.0-rc1. Apache Cassandra is a fully distributed database. It is the right choice when you need scalability and high availability without compromising performance. http://cassandra.apache.org/ Downloads of source

[RELEASE] Apache Cassandra 4.0-beta4 released

2020-12-30 Thread Mick Semb Wever
The Cassandra team is pleased to announce the release of Apache Cassandra version 4.0-beta4. Apache Cassandra is a fully distributed database. It is the right choice when you need scalability and high availability without compromising performance. http://cassandra.apache.org/ Downloads of

Re: Enable Ttracing

2020-11-30 Thread Mick Semb Wever
> I just took a cursory look at the presentation and Zipkin.io. Would using > Zipkin degrade performance? Would it be considerable? > In comparison, no. > > Traces (spans) are immediately off-threaded into a Kafka Zipkin transport, and then the Zipkin server has its own Cassandra cluster. This

Re: Enable Ttracing

2020-11-29 Thread Mick Semb Wever
> I have a feeling that this tool will give me hell.  > I'll just have to wait till they implement it and monitor the clusters, > but at least I know what to expect. > The tracing implementation is pluggable in 3.11. For example you can push traces into Zipkin (and a separate C* cluster) using

[RELEASE] Apache Cassandra 4.0-beta3 released

2020-11-04 Thread Mick Semb Wever
The Cassandra team is pleased to announce the release of Apache Cassandra version 4.0-beta3. Apache Cassandra is a fully distributed database. It is the right choice when you need scalability and high availability without compromising performance. http://cassandra.apache.org/ Downloads of

[RELEASE] Apache Cassandra 3.0.23 released

2020-11-04 Thread Mick Semb Wever
The Cassandra team is pleased to announce the release of Apache Cassandra version 3.0.23. Apache Cassandra is a fully distributed database. It is the right choice when you need scalability and high availability without compromising performance. http://cassandra.apache.org/ Downloads of source

[RELEASE] Apache Cassandra 3.11.9 released

2020-11-04 Thread Mick Semb Wever
The Cassandra team is pleased to announce the release of Apache Cassandra version 3.11.9. Apache Cassandra is a fully distributed database. It is the right choice when you need scalability and high availability without compromising performance. http://cassandra.apache.org/ Downloads of source

[RELEASE] Apache Cassandra 2.2.19 released

2020-11-04 Thread Mick Semb Wever
The Cassandra team is pleased to announce the release of Apache Cassandra version 2.2.19. Apache Cassandra is a fully distributed database. It is the right choice when you need scalability and high availability without compromising performance. http://cassandra.apache.org/ Downloads of source

[RELEASE] Apache Cassandra 4.0-beta2 released

2020-08-31 Thread Mick Semb Wever
The Cassandra team is pleased to announce the release of Apache Cassandra version 4.0-beta2. Apache Cassandra is a fully distributed database. It is the right choice when you need scalability and high availability without compromising performance. http://cassandra.apache.org/ Downloads of

[RELEASE] Apache Cassandra 3.0.22 released

2020-08-31 Thread Mick Semb Wever
The Cassandra team is pleased to announce the release of Apache Cassandra version 3.0.22. Apache Cassandra is a fully distributed database. It is the right choice when you need scalability and high availability without compromising performance. http://cassandra.apache.org/ Downloads of source

[RELEASE] Apache Cassandra 3.11.8 released

2020-08-31 Thread Mick Semb Wever
The Cassandra team is pleased to announce the release of Apache Cassandra version 3.11.8. Apache Cassandra is a fully distributed database. It is the right choice when you need scalability and high availability without compromising performance. http://cassandra.apache.org/ Downloads of source

[RELEASE] Apache Cassandra 2.2.18 released

2020-08-31 Thread Mick Semb Wever
The Cassandra team is pleased to announce the release of Apache Cassandra version 2.2.18. Apache Cassandra is a fully distributed database. It is the right choice when you need scalability and high availability without compromising performance. http://cassandra.apache.org/ Downloads of source

[RELEASE] Apache Cassandra 2.1.22 released

2020-08-31 Thread Mick Semb Wever
The Cassandra team is pleased to announce the release of Apache Cassandra version 2.1.22. Apache Cassandra is a fully distributed database. It is the right choice when you need scalability and high availability without compromising performance. http://cassandra.apache.org/ Downloads of source

[RELEASE] Apache Cassandra 3.0.21 released

2020-07-29 Thread Mick Semb Wever
The Cassandra team is pleased to announce the release of Apache Cassandra version 3.0.21. Apache Cassandra is a fully distributed database. It is the right choice when you need scalability and high availability without compromising performance. http://cassandra.apache.org/ Downloads of source

[RELEASE] Apache Cassandra 3.11.7 released

2020-07-24 Thread Mick Semb Wever
The Cassandra team is pleased to announce the release of Apache Cassandra version 3.11.7. Apache Cassandra is a fully distributed database. It is the right choice when you need scalability and high availability without compromising performance. http://cassandra.apache.org/ Downloads of source

[RELEASE] Apache Cassandra 2.2.17 released

2020-07-24 Thread Mick Semb Wever
The Cassandra team is pleased to announce the release of Apache Cassandra version 2.2.17. Apache Cassandra is a fully distributed database. It is the right choice when you need scalability and high availability without compromising performance. http://cassandra.apache.org/ Downloads of source

Re: [RELEASE] Apache Cassandra 4.0-beta1 released

2020-07-24 Thread Mick Semb Wever
> This version is a beta release[1] on the 4.0 series. As always, please > pay attention to the release notes[2] and let us know[3] if you were > to encounter any problem. A quick followup note to both user and dev groups. Our Beta release guidelines¹ state that there will be no further API

[RELEASE] Apache Cassandra 4.0-beta1 released

2020-07-20 Thread Mick Semb Wever
The Cassandra team is pleased to announce the release of Apache Cassandra version 4.0-beta1. Apache Cassandra is a fully distributed database. It is the right choice when you need scalability and high availability without compromising performance. http://cassandra.apache.org/ Downloads of

[RELEASE] Apache Cassandra 4.0-alpha4 released

2020-04-24 Thread Mick Semb Wever
The Cassandra team is pleased to announce the release of Apache Cassandra version 4.0-alpha4. Apache Cassandra is a fully distributed database. It is the right choice when you need scalability and high availability without compromising performance. http://cassandra.apache.org/ Downloads of

[RELEASE] Apache Cassandra 4.0-alpha3 released

2020-02-07 Thread Mick Semb Wever
The Cassandra team is pleased to announce the release of Apache Cassandra version 4.0-alpha3. Apache Cassandra is a fully distributed database. It is the right choice when you need scalability and high availability without compromising performance. http://cassandra.apache.org/ Downloads

JOB | The Last Pickle (Consultant) in USA

2019-11-20 Thread Mick Semb Wever
The Last Pickle is hiring in the US: https://thelastpickle.com/blog/2019/10/24/tlp-is-hiring-another-consultant.html If you enjoy Cassandra like we do, and are keen to join our team, reach out (see details in link above). regards, Mick

Re: Cassandra: Inconsistent data on reads (LOCAL_QUORUM)

2018-10-24 Thread Mick Semb Wever
t reads that Jeff's already pointed out, is the use of `speculative_retry='ALWAYS'`. Have there been topology changes in your cluster recently? Next step would be to try and repeat it with tracing. regards, Mick -- Mick Semb Wever Australia The Last Pickle Apache Cassandra Consulting h

Re: Cassandra: Inconsistent data on reads (LOCAL_QUORUM)

2018-10-20 Thread Mick Semb Wever
> Thanks James. Yeah, we're using the datastax java driver. But we're on > version 2.1.10.2. And we are not using the client side timestamps. Just to check Ninad. If you are using Cassandra-2.1 (native protocol v3) and the java driver version 3.0 or above, then you would be using client-side

Re: Reaper 1.2 released

2018-07-25 Thread Mick Semb Wever
Feel free to file issues at https://github.com/thelastpickle/cassandra-reaper/issues or chat with us at https://gitter.im/thelastpickle/cassandra-reaper regards, Mick On Thu, 26 Jul 2018, at 06:18, Abdul Patel wrote: > Was able to start it but unable to start any repair manually; it says >

Re: Inconsistent Quorum Read after Quorum Write

2018-07-11 Thread Mick Semb Wever
ant help from the open source community, none of us really enjoy debugging old code :-) regards, Mick -- Mick Semb Wever Australia The Last Pickle Apache Cassandra Consulting http://www.thelastpickle.com

Re: Inconsistent Quorum Read after Quorum Write

2018-07-11 Thread Mick Semb Wever
essages or flapping nodes won't help. I'd also be prepared to upgrade to 3.11.3, when it does get released. regards, Mick -- Mick Semb Wever Australia The Last Pickle Apache Cassandra Consulting http://www.thelastpickle.com

Re: How to Protect Tracing Requests From Client Side

2018-03-22 Thread Mick Semb Wever
ivate static final NoOpTraceState INSTANCE = new NoOpTraceState(); private NoOpTraceState() { super(FBUtilities.getBroadcastAddress(), UUID.randomUUID(), TraceType.NONE); } @Override protected void traceImpl(String message) {} } } ``` regards, Mick -

Re: 3.0.15 or 3.11.1

2018-01-08 Thread Mick Semb Wever
> > Can you please provide some JIRAs for superior fixes and performance > improvements which are present in 3.11.1 but are missing in 3.0.15. > Some that come to mind… Cassandra Storage Engine: CASSANDRA-12269, CASSANDRA-12731 Streaming and Compaction: CASSANDRA-11206, CASSANDRA-9766,

Re: Why does SASI index consume such a huge disk space?

2018-01-03 Thread Mick Semb Wever
> I use zipkin (https://github.com/openzipkin/zipkin) to trace my system. > > When I upgraded to the latest version, 3.23 to be specific, I hit a problem where our monitor keeps alerting that there is not enough disk space for cassandra. You're right. CONTAINS SASI indexes do indeed use a lot of

Re: 3.0.15 or 3.11.1

2018-01-03 Thread Mick Semb Wever
> > I want to upgrade from 2.x to 3.x. > > I can definitely use the features in 3.11.1 but it's not a must. > So my question is, is 3.11.1 stable and suitable for Production compared > to 3.0.15? > Use 3.11.1 and don't use any 3.0.x or 3.x features. 3.11.1 is effectively three sequential patch

Re: LegacySchemaTables.createKeyspaceFromSchemaPartition fails with an IllegalStateException

2016-07-31 Thread Mick Semb Wever
; ~[apache-cassandra-2.2.7.jar:2.2.7-SNAPSHOT] > at > org.apache.cassandra.schema.LegacySchemaTables.readSchemaFromSystemTables(LegacySchemaTables.java:219) > ~[apache-cassandra-2.2.7.jar:2.2.7-SNAPSHOT] > Soto, I've created the following issue for this – https://issues.apache.org/j

Re: CASSANDRA-2388 - ColumnFamilyRecordReader fails for a given split because a host is down

2012-03-16 Thread Mick Semb Wever
Sorry for such a late reply. I'm not always keeping up with the mailing list. Is the following scenario covered by 2388? I have a test cluster of 6 nodes with a replication factor of 3. Each server can execute hadoop tasks. 1 cassandra node is down for the test. The job is kicked off from

Re: OOM opening bloom filter

2012-03-13 Thread Mick Semb Wever
How much smaller did the BF get to ? After pending compactions completed today, i'm presuming fp_ratio is applied now to all sstables in the keyspace, it has gone from 20G+ down to 1G. This node is now running comfortably on Xmx4G (used heap ~1.5G). ~mck -- A Microsoft Certified System

Re: OOM opening bloom filter

2012-03-12 Thread Mick Semb Wever
It's my understanding then for this use case that bloom filters are of little importance and that i can Ok. To summarise our actions to get us out of this situation, in hope that it may help others one day, we did the following actions: 1) upgrade to 1.0.7 2) set fp_ratio=0.99 3)

OOM opening bloom filter

2012-03-11 Thread Mick Semb Wever
Using cassandra-1.0.6 one node fails to start. java.lang.OutOfMemoryError: Java heap space at org.apache.cassandra.utils.obs.OpenBitSet.init(OpenBitSet.java:104) at org.apache.cassandra.utils.obs.OpenBitSet.init(OpenBitSet.java:92) at

Re: OOM opening bloom filter

2012-03-11 Thread Mick Semb Wever
On Sun, 2012-03-11 at 15:06 -0700, Peter Schuller wrote: If it is legitimate use of memory, you *may*, depending on your workload, want to adjust target bloom filter false positive rates: https://issues.apache.org/jira/browse/CASSANDRA-3497 This particular cf has up to ~10 billion rows
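To put the false positive rate discussed in this thread into numbers, here is a rough back-of-the-envelope sketch. It uses the textbook formula for an optimally sized bloom filter, bits per key = -ln(p) / (ln 2)^2, which is an assumption on my part: Cassandra's own sizing logic differs in detail, so these figures will not reproduce the heap numbers reported in the thread. The class and method names are hypothetical.

```java
// Back-of-the-envelope bloom filter sizing with the textbook formula.
// Assumption: an optimally sized filter; Cassandra's real sizing differs in detail.
final class BloomFilterSizing {

    /** Bits needed per key for a target false-positive probability p (0 < p < 1). */
    static double bitsPerKey(double p) {
        double ln2 = Math.log(2);
        return -Math.log(p) / (ln2 * ln2);
    }

    /** Total filter size in bytes for the given number of keys. */
    static double filterBytes(long keys, double p) {
        return keys * bitsPerKey(p) / 8.0;
    }

    public static void main(String[] args) {
        long keys = 10_000_000_000L; // ~10 billion rows, as mentioned in the thread
        for (double p : new double[] {0.01, 0.1, 0.99}) {
            System.out.printf("p=%.2f -> %.3f bits/key, %.1f MiB%n",
                    p, bitsPerKey(p), filterBytes(keys, p) / (1024.0 * 1024.0));
        }
    }
}
```

Under this model a 1% target costs roughly 9.6 bits per key (about 11 GiB for 10 billion keys), while a target of 0.99 needs only a few tens of MiB, which is the direction of the saving described in the thread.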

Re: OOM opening bloom filter

2012-03-11 Thread Mick Semb Wever
On Sun, 2012-03-11 at 15:36 -0700, Peter Schuller wrote: Are you doing RF=1? That is correct. So are your calculations then :-) very small, 1k. Data from this cf is only read via hadoop jobs in batch reads of 16k rows at a time. [snip] It's my understanding then for this use case that

memory problems still post- CASSANDRA-3492

2011-11-15 Thread Mick Semb Wever
I've got a follow-up problem to CASSANDRA-3492, also related to ridiculously high memory. After the fix yesterday for CASSANDRA-3492 I have that node in question up and running. But another node (on the same machine but different cluster), even after an upgrade to the staging 1.0.3 and a

Re: range slice with TimeUUID column names

2011-11-13 Thread Mick Semb Wever
On Thu, 2011-11-10 at 22:35 -0800, footh wrote: UUID startId = new UUID(UUIDGen.createTime(start), UUIDGen.getClockSeqAndNode()); UUID finishId = new UUID(UUIDGen.createTime(finish), UUIDGen.getClockSeqAndNode()); You have got comparator_type = TimeUUIDType ? ~mck -- The old law
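For readers unfamiliar with why the comparator matters here, the following is a minimal sketch of how a version-1 (time-based) UUID carries its timestamp, using only `java.util.UUID`. It is not Cassandra's `UUIDGen`; the class and method names are hypothetical, and the clock-sequence/node bits are placeholder choices, since how ties within one timestamp are ordered depends on the comparator implementation.

```java
import java.util.UUID;

// Sketch: build version-1 (time-based) UUID bounds for a time range.
// TimeUuidBounds and its methods are hypothetical, not Cassandra's UUIDGen API.
final class TimeUuidBounds {

    // 100-ns intervals between the UUID epoch (1582-10-15) and the Unix epoch (1970-01-01).
    static final long UUID_EPOCH_OFFSET = 0x01B21DD213814000L;

    /** Packs a Unix time in milliseconds into the most significant 64 bits of a v1 UUID. */
    static long msbForMillis(long unixMillis) {
        long ts = unixMillis * 10_000L + UUID_EPOCH_OFFSET; // 60-bit timestamp in 100-ns units
        long timeLow = ts & 0xFFFFFFFFL;
        long timeMid = (ts >>> 32) & 0xFFFFL;
        long timeHi = (ts >>> 48) & 0x0FFFL;
        return (timeLow << 32) | (timeMid << 16) | 0x1000L | timeHi; // 0x1000 sets version 1
    }

    /** Lowest clock-seq/node bits with the RFC 4122 variant set (placeholder choice). */
    static UUID lowerBound(long unixMillis) {
        return new UUID(msbForMillis(unixMillis), 0x8000000000000000L);
    }

    /** Highest clock-seq/node bits with the RFC 4122 variant set (placeholder choice). */
    static UUID upperBound(long unixMillis) {
        return new UUID(msbForMillis(unixMillis), 0xBFFFFFFFFFFFFFFFL);
    }

    public static void main(String[] args) {
        UUID start = lowerBound(1_320_000_000_000L);
        UUID finish = upperBound(1_320_003_600_000L); // one hour later
        System.out.println(start + " .. " + finish);
        System.out.println(start.timestamp() < finish.timestamp());
    }
}
```

The point of the sketch is that the timestamp is scattered across the high bits, so a range between two such values only behaves as a time range when the column comparator orders by the embedded timestamp rather than by raw bytes.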

get_range_slices OOM on CompressionMetadata.readChunkOffsets(..)

2011-10-31 Thread Mick Semb Wever
After an upgrade to cassandra-1.0 any get_range_slices gives me: java.lang.OutOfMemoryError: Java heap space at org.apache.cassandra.io.compress.CompressionMetadata.readChunkOffsets(CompressionMetadata.java:93) at

Re: get_range_slices OOM on CompressionMetadata.readChunkOffsets(..)

2011-10-31 Thread Mick Semb Wever
On Mon, 2011-10-31 at 08:00 +0100, Mick Semb Wever wrote: After an upgrade to cassandra-1.0 any get_range_slices gives me: java.lang.OutOfMemoryError: Java heap space at org.apache.cassandra.io.compress.CompressionMetadata.readChunkOffsets(CompressionMetadata.java:93

Re: get_range_slices OOM on CompressionMetadata.readChunkOffsets(..)

2011-10-31 Thread Mick Semb Wever
On Mon, 2011-10-31 at 10:08 +0100, Sylvain Lebresne wrote: I set chunk_length_kb to 16 as my rows are very skinny (typically 100b) I see now this was a bad choice. The read pattern of these rows is always in bulk so the chunk_length could have been much higher so to reduce memory usage

Re: OOM on CompressionMetadata.readChunkOffsets(..)

2011-10-31 Thread Mick Semb Wever
On Mon, 2011-10-31 at 09:07 +0100, Mick Semb Wever wrote: The read pattern of these rows is always in bulk so the chunk_length could have been much higher so to reduce memory usage (my largest sstable is 61G). Isn't CompressionMetadata.readChunkOffsets(..) rather dangerous here? Given a 60G

Re: OOM on CompressionMetadata.readChunkOffsets(..)

2011-10-31 Thread Mick Semb Wever
On Mon, 2011-10-31 at 13:05 +0100, Mick Semb Wever wrote: Given a 60G sstable, even with 64kb chunk_length, to read just that one sstable requires close to 8G free heap memory... Arg, that calculation was a little off... (a long isn't exactly 8K...) But you get my concern... ~mck -- When
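The corrected arithmetic can be sketched as follows, assuming (as the thread implies) that one 8-byte `long` offset is held in memory per compressed chunk. The class and method names are hypothetical and this is not the actual `CompressionMetadata` code.

```java
// Rough estimate of heap used by chunk offsets for one compressed sstable.
// Assumption: one 8-byte long offset per chunk; not the real CompressionMetadata code.
final class ChunkOffsetMemory {

    /** Number of chunks needed to cover the data, rounding up. */
    static long chunkCount(long dataBytes, long chunkLengthBytes) {
        return (dataBytes + chunkLengthBytes - 1) / chunkLengthBytes;
    }

    /** Bytes of heap needed to hold one long offset per chunk. */
    static long offsetBytes(long dataBytes, long chunkLengthBytes) {
        return chunkCount(dataBytes, chunkLengthBytes) * Long.BYTES;
    }

    public static void main(String[] args) {
        long sstable = 60L << 30; // 60 GiB
        for (long chunkKb : new long[] {16, 64}) {
            long bytes = offsetBytes(sstable, chunkKb << 10);
            System.out.printf("chunk_length_kb=%d -> %d chunks, %.1f MiB of offsets%n",
                    chunkKb, chunkCount(sstable, chunkKb << 10), bytes / (1024.0 * 1024.0));
        }
    }
}
```

With 8 bytes per offset a 60 GiB sstable at 64 KiB chunks comes to about 7.5 MiB, not gigabytes; the earlier "close to 8G" figure is what results from multiplying by 8 KiB instead.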

Re: Task's map reading more record than CFIF's inputSplitSize

2011-09-07 Thread Mick Semb Wever
3 map tasks (from 4013) is still running after read 25 million rows. Can this be a bug in StorageService.getSplits(..) ? getSplits looks pretty foolproof to me but I guess we'd need to add more debug logging to rule out a bug there for sure. I guess the main alternative would be a bug

Re: RF=1 w/ hadoop jobs

2011-09-05 Thread Mick Semb Wever
On Fri, 2011-09-02 at 09:28 +0200, Patrik Modesto wrote: We use Cassandra as a storage for web-pages, we store the HTML, all URLs that has the same HTML data and some computed data. We run Hadoop MR jobs to compute lexical and thematical data for each page and for exporting the data to a

Re: KeyRange in the CoumnFamilyInputFormat

2011-09-05 Thread Mick Semb Wever
On Mon, 2011-09-05 at 18:18 +0300, Vitaly Vengrov wrote: See these rows in the ColumnFamilyInputFormat.getSplits method : assert jobKeyRange.start_key == null : only start_token supported; assert jobKeyRange.end_key == null : only end_token supported; So, the question

Re: KeyRange in the CoumnFamilyInputFormat

2011-09-05 Thread Mick Semb Wever
On Mon, 2011-09-05 at 19:02 +0200, Mick Semb Wever wrote: ConfigHelper.setInputRange( jobConf, partitioner.getTokenFactory().toString(partitioner.getToken(myKey)), partitioner.getTokenFactory().toString(partitioner.getToken(myKey

Re: RF=1 w/ hadoop jobs

2011-09-05 Thread Mick Semb Wever
On Mon, 2011-09-05 at 21:52 +0200, Patrik Modesto wrote: I'm not sure about 0.8.x and 0.7.9 (to be released today with your patch) but 0.7.8 will fail even with RF1 when there is Hadoop TaskTracker without local Cassandra. So increasing RF is not a solution. This isn't true (or not the

Re: RF=1 w/ hadoop jobs

2011-09-02 Thread Mick Semb Wever
On Fri, 2011-09-02 at 08:20 +0200, Patrik Modesto wrote: As Jonathan already explained himself: ignoring unavailable ranges is a misfeature, imo Generally it's not what one would want i think. But I can see the case when data is to be treated volatile and ignoring unavailable ranges may be

Re: [jira] [Commented] (CASSANDRA-2474) CQL support for compound columns

2011-06-13 Thread Mick Semb Wever
On Sun, 2011-06-12 at 18:53 +, Mick Semb Wever wrote: This issue could stand to be summarized (I still wish we used a mailing list for monsters like this). This i actually really appreciate about the cassandra community. To formulate this: As a newbie here it has allowed me

Re: [jira] [Commented] (CASSANDRA-2474) CQL support for compound columns

2011-06-12 Thread Mick Semb Wever
On Sun, 2011-06-12 at 13:50 +, Eric Evans (JIRA) wrote: Eric Evans commented on CASSANDRA-2474: --- This issue could stand to be summarized (I still wish we used a mailing list for monsters like this). This i actually really appreciate about the

Re: [jira] [Commented] (CASSANDRA-2474) CQL support for compound columns

2011-06-12 Thread Mick Semb Wever
On Sun, 2011-06-12 at 12:10 -0500, Eric Evans wrote: Why not send all Jira changes to a mailing list already (like other communities do). We do. I had a quick search and could not find it. But now i see it's part of the commits list. ~mck -- Everything you can imagine is real. Pablo

CL.ONE gives UnavailableException on ok node

2011-04-15 Thread Mick Semb Wever
Just experienced something i don't understand yet. Running a 3 node cluster successfully for a few days now, then one of the nodes went down (server required reboot). After this the other two nodes kept throwing UnavailableExceptions like UnavailableException() at

Re: CL.ONE gives UnavailableException on ok node

2011-04-15 Thread Mick Semb Wever
On Fri, 2011-04-15 at 15:43 -0500, Jonathan Ellis wrote: Sure sounds like you have RF=1 to me. Yes that's right. I see... so the answer here is that i should be using CL.ANY ? (so the write goes through and hinted handoff can get it to the correct node later on). ~mck -- The fox condemns
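The reasoning in this exchange can be illustrated with a deliberately simplified model: a write is accepted when the number of live replicas for the key is at least what the consistency level requires. The model treats ANY as requiring zero live replicas because a hint can stand in for the replica. All names are hypothetical; this is a sketch of the idea, not Cassandra's actual availability check.

```java
// Simplified availability model: live replicas vs. what the consistency level needs.
// Hypothetical names; not Cassandra's actual implementation.
final class AvailabilityModel {

    enum Level { ANY, ONE, QUORUM, ALL }

    /** Replicas that must be alive for a write at the given level and replication factor. */
    static int requiredReplicas(Level level, int rf) {
        switch (level) {
            case ANY:
                return 0; // assumption: a stored hint is enough
            case ONE:
                return 1;
            case QUORUM:
                return rf / 2 + 1;
            default:
                return rf; // ALL
        }
    }

    /** True if enough replicas of the key's range are alive. */
    static boolean writeAccepted(Level level, int rf, int liveReplicas) {
        return liveReplicas >= requiredReplicas(level, rf);
    }

    public static void main(String[] args) {
        // RF=1 and the only replica for a key is down: ONE fails, ANY is accepted.
        System.out.println(writeAccepted(Level.ONE, 1, 0));
        System.out.println(writeAccepted(Level.ANY, 1, 0));
        // RF=3 with one replica down: ONE and QUORUM are both still accepted.
        System.out.println(writeAccepted(Level.ONE, 3, 2));
        System.out.println(writeAccepted(Level.QUORUM, 3, 2));
    }
}
```

With RF=1 every key has exactly one replica, so a single node outage makes roughly a third of the keys in a 3-node cluster unwritable at CL.ONE even though the coordinator itself is healthy.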

Re: map reduce job over indexed range of keys

2011-03-18 Thread Mick Semb Wever
On Thu, 2011-02-24 at 19:45 -0500, Matt Kennedy wrote: Right, so I'm interpreting silence as a confirmation on all points. I opened: https://issues.apache.org/jira/browse/CASSANDRA-2245 https://issues.apache.org/jira/browse/CASSANDRA-2246 I think

Re: [mapreduce] ColumnFamilyRecordWriter hidden reuse

2011-01-25 Thread Mick Semb Wever
On Tue, 2011-01-25 at 09:37 +0100, Patrik Modesto wrote: While developing really simple MR task, I've found that a combination of Hadoop optimization and Cassandra ColumnFamilyRecordWriter queue creates wrong keys to send to batch_mutate(). I've seen similar behaviour (junk rows being

Re: [mapreduce] ColumnFamilyRecordWriter hidden reuse

2011-01-25 Thread Mick Semb Wever
On Tue, 2011-01-25 at 14:16 +0100, Patrik Modesto wrote: The attached file contains the working version with cloned key in reduce() method. My other approach was: context.write(ByteBuffer.wrap(key.getBytes(), 0, key.getLength()), Collections.singletonList(getMutation(key))); Which
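The kind of hidden reuse discussed in this thread can be reproduced without Hadoop or Cassandra. The sketch below simulates a framework that reuses one byte array for successive keys, and a writer that queues keys for later sending. `ByteBuffer.wrap` shares the backing array, so queued buffers see later overwrites unless the bytes are copied first. The class, the fixed 64-byte buffer, and the key values are all hypothetical and only serve the illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simulates a reused key buffer feeding a queue that is drained later.
// Hypothetical demo; keys are assumed to fit in the 64-byte shared buffer.
final class KeyReuseDemo {

    /** Queues keys by wrapping the shared array: queued buffers alias the same bytes. */
    static List<String> queueByWrapping(String[] keys) {
        return run(keys, false);
    }

    /** Queues keys by copying the bytes first: each queued buffer owns its data. */
    static List<String> queueByCopying(String[] keys) {
        return run(keys, true);
    }

    private static List<String> run(String[] keys, boolean copy) {
        byte[] shared = new byte[64]; // stands in for the framework's reused key object
        List<ByteBuffer> queue = new ArrayList<>();
        for (String key : keys) {
            byte[] raw = key.getBytes(StandardCharsets.UTF_8);
            System.arraycopy(raw, 0, shared, 0, raw.length); // framework overwrites in place
            queue.add(copy
                    ? ByteBuffer.wrap(Arrays.copyOf(shared, raw.length))
                    : ByteBuffer.wrap(shared, 0, raw.length));
        }
        List<String> sent = new ArrayList<>();
        for (ByteBuffer buf : queue) { // drained only after all keys were written
            sent.add(StandardCharsets.UTF_8.decode(buf.duplicate()).toString());
        }
        return sent;
    }

    public static void main(String[] args) {
        String[] keys = {"key1", "key2", "key3"};
        System.out.println(queueByWrapping(keys)); // every entry reflects the last key
        System.out.println(queueByCopying(keys));  // original keys preserved
    }
}
```

Copying costs an allocation per key, but it is the only safe option whenever the consumer of the buffer runs after the producer has moved on to the next record.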

Re: Cassandra on iSCSI?

2011-01-22 Thread Mick Semb Wever
So if one is forced to use a SAN, how should you set up Cassandra is the interesting question - to me! Here are some thoughts:- 1. Ensure that each node gets dedicated - not shared - LUNs 2. Ensure that these LUNs do share spindles, or nodes will cease to be isolatable (this will be tough

Re: Cassandra on iSCSI?

2011-01-21 Thread Mick Semb Wever
Of course with a SAN you'd want RF=1 since it's replicating internally. Isn't this the same case for raid-5 as well? And we want RF=2 if we need to keep reading while doing rolling restarts? ~mck -- “Anyone who lives within their means suffers from a lack of imagination.” - Oscar Wilde |

Re: Cassandra on iSCSI?

2011-01-21 Thread Mick Semb Wever
[OT] They're quoting roughly the same price for both (claiming that the extra cost goes into having for each node a separate disk cabinet to run local raid-5). You might not need raid-5 for local attached storage. Yes we did ask. But raid-5 is the

Cassandra on iSCSI?

2011-01-20 Thread Mick Semb Wever
Does anyone have any experiences with Cassandra on iSCSI? I'm currently testing a (soon-to-be) production server using both local raid-5 and iSCSI disks. Our hosting provider is pushing us hard towards the iSCSI disks because it is easier for them to run (and to meet our needs for increasing disk

Re: Cassandra on iSCSI?

2011-01-20 Thread Mick Semb Wever
It should work fine; the main reason to go with local storage is the huge cost advantage. [OT] They're quoting roughly the same price for both (claiming that the extra cost goes into having for each node a separate disk cabinet to run local raid-5). *I just committed a README for