Once you have created the CF from cqlsh, switch over to cassandra-cli
and run describe schema. It will show you the schema for all your
column families in syntax that can be passed back into cassandra-cli
to create them.
The cassandra-cli syntax that you are looking for is probably the
Yes. Any FUSE filesystem is going to be substantially slower than a native
one like ext4.
-Tupshin
On Oct 30, 2012 2:09 PM, Brian Tarbox tar...@cabotresearch.com wrote:
I got some new ubuntu servers to add to my cluster and found that the file
system is fuseblk which really means NTFS.
All
What consistency level are you writing with? If you were writing with ANY,
try writing with a higher consistency level.
-Tupshin
On Nov 18, 2012 9:05 PM, Chuan-Heng Hsiao hsiao.chuanh...@gmail.com
wrote:
Hi Aaron,
Thank you very much for the reply.
The 700 CFs were created in the
Rules that apply:
2 - guaranteed access
3 - treatment of nulls (though different than an rdbms due to the inherent
sparse nature of rows)
4 - online catalog (not really true until Cassandra 1.2 and CQL 3)
5 - comprehensive data sub language (only if you remove the word relational)
6 - view updating
Unless I'm misreading the git history, the stack trace you referenced isn't
from 1.1.2. In particular, the writeHintForMutation method in
StorageProxy.java wasn't added to the codebase until September 9th (
What is in your Cassandra log right before and after that freeze?
-Tupshin
On Mar 20, 2013 8:06 AM, Joel Samuelsson samuelsson.j...@gmail.com
wrote:
Hello,
I've been trying to load test a one node cassandra cluster. When I add
lots of data, the Cassandra node freezes for 4-5 minutes during
Speaking from practical experience, it is possible to simulate this feature
by retrieving a slice of your row that only contains the most recent 100
items. You can then prevent the rows from growing out of control by checking
the size of the row and pruning it back to 100 every N writes, where N
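A CQL sketch of that slice-and-prune approach (table and column names are hypothetical):

```cql
CREATE TABLE user_feed (
    user_id text,
    item_id bigint,
    payload text,
    PRIMARY KEY (user_id, item_id)
);

-- Read only the newest 100 items of the partition
SELECT item_id, payload FROM user_feed
WHERE user_id = 'u1'
ORDER BY item_id DESC LIMIT 100;

-- Every N writes, prune everything older than the oldest item returned above
-- (range deletes on clustering columns need a newer Cassandra release; on
-- older versions, delete the stale items individually)
DELETE FROM user_feed
WHERE user_id = 'u1' AND item_id < 900;
```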
Any chance your server has been running for the last two weeks with the
leap second bug?
http://www.datastax.com/dev/blog/linux-cassandra-and-saturdays-leap-second-problem
-Tupshin
On Jul 12, 2012 1:43 PM, Leonid Ilyevsky lilyev...@mooncapital.com
wrote:
I am loading a large set of data into a
Generate a timeuuid for each post based on the original timestamp.
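In modern CQL terms, a sketch of that idea (hypothetical schema; the client builds each timeuuid from the post's historical timestamp, and the uuid's random component disambiguates posts that share a timestamp):

```cql
CREATE TABLE posts (
    author text,
    post_id timeuuid,   -- encodes the original timestamp plus a random component
    body text,
    PRIMARY KEY (author, post_id)
);

-- Time-range queries can use the timeuuid bound helpers
SELECT body FROM posts
WHERE author = 'erik'
  AND post_id > minTimeuuid('2010-01-01 00:00+0000')
  AND post_id < maxTimeuuid('2010-02-01 00:00+0000');
```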
-Tupshin
On May 29, 2010 7:50 PM, Erik eriko...@gmail.com wrote:
Hi,
I have a list of posts I'm trying to insert into Cassandra. Each post has a
timestamp already (in the past) that is not necessarily unique. I'm trying
to
There is potentially a DSE specific issue that you are running into and you
should probably contact Datastax support to confirm. Also, keep in mind
that Cassandra does recycle its commitlog files instead of deleting and
recreating them, so you shouldn't expect them to disappear even when the
node
It's conceivable that one of the faster USB 3.0 sticks would be sufficient
for this. I wouldn't exactly call it an enterprise configuration, but
it's worth considering. Keep in mind that if you are comfortable using your
RF for durability, you can turn off durable_writes on your keyspace and not
Increasing the phi value to 12 can be a partial workaround. It's certainly
not a fix, but it does partially alleviate the issue. Otherwise hang in
there until 1.2.12. Aaron is probably right that this is aggravated on
under powered nodes, but larger nodes can still see these symptoms.
-Tupshin
On
No. This is not going to work. The vnodes feature requires the murmur3
partitioner which was introduced with Cassandra 1.2.
Since you are currently using 1.1, you must be using the random
partitioner, which is not compatible with vnodes.
Because the partitioner determines the physical layout
/12/30 Tupshin Harper tups...@tupshin.com
No. This is not going to work. The vnodes feature requires the
murmur3 partitioner which was introduced with Cassandra 1.2.
Since you are currently using 1.1, you must be using the random
partitioner, which is not compatible with vnodes
OK. Given the correction of my unfortunate partitioner error, you can, and
probably should, upgrade in place to 1.2, but with num_tokens=1 so it will
initially behave like 1.1 non vnodes would. Then you can do a rolling
conversion to more than one vnode per node, and once complete, shuffle your
This is a generally good interpretation of the state of vnodes with respect
to Cassandra versions 1.2.12 and 1.2.13.
Adding a new datacenter to a 1.2.12 cluster at your scale should be fine. I
consider vnodes fit for production at almost any scale after 1.2.13, or 50
nodes or less (ballpark) for
That is a fine option and can make perfect sense if you have keyspaces with
very different runtime characteristics.
-Tupshin
On Jan 7, 2014 7:30 AM, Robert Wille rwi...@fold3.com wrote:
I’d like to have my keyspaces on different volumes, so that some can be on
SSD and others on spinning disk.
Yes, this is pretty close to the ultimate anti-pattern in Cassandra.
Whenever possible, we encourage models where your updates are idempotent,
and not dependent on a read before write. Manoj is looking for what is
essentially strong ordering in a distributed system, which always has
inherent
It is bad because of the risk of concurrent modifications. If you don't
have some kind of global lock on the document/row, then 2 readers might
read version A, reader 1 writes version B based on A, and reader 2 writes
version C based on A, overwriting the changes in B. This is *inherent* to
the
are sequential and from
the same thread and with Consistency ALL,
the write should not return until all replicas have committed. So I am
expecting all replicas to have the same value, when the next read happens.
Not true ??
regards
On Fri, Jan 10, 2014 at 2:51 PM, Tupshin Harper tups
This should be the doc you are looking for.
http://www.datastax.com/documentation/cassandra/2.0/webhelp/index.html#cassandra/operations/ops_add_dc_to_cluster_t.html
-Tupshin
On Jan 21, 2014 2:14 AM, Lu, Boying boying...@emc.com wrote:
Hi, All,
I’m new to Cassandra. I want to know how to
One CQL row per user, keyed off of the UUID.
Another table keyed off of email, with another column containing the UUID
for lookups in the first table. Only registration will require a
lightweight transaction, and only for the purpose of avoiding duplicate
email registration race conditions.
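In CQL this two-table scheme might look like the following (names are hypothetical, and lightweight transactions require Cassandra 2.0+):

```cql
CREATE TABLE users (
    id timeuuid PRIMARY KEY,
    email text,
    name text
);

CREATE TABLE user_email_index (
    email text PRIMARY KEY,
    id timeuuid
);

-- Registration: claim the email with a lightweight transaction first;
-- the client generates one timeuuid and reuses it in both statements
INSERT INTO user_email_index (email, id)
VALUES ('a@example.com', ?) IF NOT EXISTS;

-- Only if the insert above was applied, write the user record (no LWT needed)
INSERT INTO users (id, email, name)
VALUES (?, 'a@example.com', 'Alice');
```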
timeuuid,
PRIMARY KEY (email, id)
);
And during registration, I would just use LWT on the user_email_index
table first and insert the record and then insert the actual user record
into user table w/o LWT. Does that sound right to you?
- Drew
On Jan 21, 2014, at 10:01 AM, Tupshin Harper
This is a known issue until Cassandra 2.1
https://issues.apache.org/jira/browse/CASSANDRA-5202
-Tupshin
On Feb 6, 2014 10:05 PM, Robert Coli rc...@eventbrite.com wrote:
On Thu, Feb 6, 2014 at 8:39 AM, Ondřej Černoš cern...@gmail.com wrote:
Update: I dropped the keyspace, the system keyspace,
While, historically, it has been true that queuing in Cassandra has been an
anti-pattern, it is also true that Leveled Compaction addresses the worst
aspect of frequent deletes in Cassandra, and that overall, queuing in
Cassandra is nowhere near the anti-pattern that it used to be. This is
#5633 was actually closed because of the static columns feature
(https://issues.apache.org/jira/browse/CASSANDRA-6561), which has been
checked in to the 2.0 branch but is not yet part of a release (it will be
in 2.0.6).
That feature will let you update multiple rows within a single partition by
You can use OpsCenter in production with DSC/Apache Cassandra clusters.
Some features are only enabled with DSE, but the rest work fine with DSC.
-Tupshin
On Feb 22, 2014 11:20 PM, user 01 user...@gmail.com wrote:
I would be using nodetool / JConsole for monitoring. Though it would
be less
under the hood would be
a compare-and-(batch)-set on a single wide row, so it may be
possible with the Thrift API (I have to check).
Thanks again!
Best regards,
Clint
On Sat, Feb 22, 2014 at 11:38 AM, Tupshin Harper tups...@tupshin.com
wrote:
#5633 was actually closed because the static
at 5:32 PM, Tupshin Harper tups...@tupshin.com wrote:
Hi Clint,
That does appear to be an omission in CQL3. It would be possible to
simulate it by doing
BEGIN BATCH
UPDATE foo SET z = 10 WHERE x = 'a' AND y = 1 IF t = 2 AND z = 10;
UPDATE foo SET t = 5, z = 6 WHERE x = 'a' AND y = 4;
APPLY BATCH;
regards,
Clint
On Mon, Feb 24, 2014 at 2:32 PM, Tupshin Harper tups...@tupshin.comwrote:
Hi Clint,
That does appear to be an omission in CQL3. It would be possible to
simulate it by doing
BEGIN BATCH
UPDATE foo SET z = 10 WHERE x = 'a' AND y = 1 IF t = 2 AND z = 10;
UPDATE foo SET t
Hi Clint,
What you are describing could actually be accomplished with the Thrift API
and a multiget_slice with a slicerange having a count of 1. Initially I was
thinking that this was an important feature gap between Thrift and CQL, and
was going to suggest that it should be implemented (possible
families in nice sized batches
SELECT family FROM id WHERE key=0;
and then do the fan-out selects that I described previously.
-Tupshin
On Tue, Feb 25, 2014 at 10:15 PM, Tupshin Harper tups...@tupshin.comwrote:
Hi Clint,
What you are describing could actually be accomplished with the Thrift API
And one last clarification. Where I said stored procedure earlier, I
meant prepared statement. Sorry for the confusion. Too much typing while
tired.
-Tupshin
On Tue, Feb 25, 2014 at 10:36 PM, Tupshin Harper tups...@tupshin.comwrote:
I failed to address the matter of not knowing the families
This is a known issue that is fixed in 2.1beta1.
https://issues.apache.org/jira/browse/CASSANDRA-5202
Until 2.1, we do not recommend relying on the recycling of tables through
drop/create or truncate.
However, on a single node cluster, I suspect that truncate will work far
more reliably than
If you can programmatically roll over onto a new column family every 6
hours (or every day or other reasonable increment), and then just drop your
existing column family after all the columns would have been expired, you
could skip your compaction entirely. It was not clear to me from your
not seem to be perfect solution.
On Thu, Feb 27, 2014 at 4:49 PM, Tupshin Harper tups...@tupshin.comwrote:
If you can programmatically roll over onto a new column family every 6
hours (or every day or other reasonable increment), and then just drop your
existing column family after all the columns
For the first question, try select * from system.peers
http://www.datastax.com/documentation/cql/cql_using/use_query_system_c.html?pagename=docs&version=1.2&file=cql_cli/using/query_system_tables
For the second, there is a JMX and nodetool command, but I'm not aware of
any way to get it directly
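For the first question, a cqlsh sketch (the exact column set of system.peers varies a bit by version, but these columns are present in 1.2/2.0):

```cql
SELECT peer, data_center, rack, tokens FROM system.peers;
```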
level is greater than 1).
Would you mind clarifying? Thanks a lot!
Best regards,
Clint
On Wed, Feb 26, 2014 at 4:56 AM, Tupshin Harper tups...@tupshin.comwrote:
And one last clarification. Where I said stored procedure earlier, I
meant prepared statement. Sorry for the confusion. Too
The complete rewrite of counters in 2.1 (which should address the counter
accuracy issues) will still have this limitation. Deleting and recreating
counters is not supported and will continue to not be supported.
-Tupshin
On Mar 1, 2014 5:13 PM, Manoj Khangaonkar khangaon...@gmail.com wrote:
20, easily. Probably far more, but I lack data points beyond that.
-Tupshin
On Mar 9, 2014 10:26 AM, Lu, Boying boying...@emc.com wrote:
Hi, experts,
Since Cassandra 2.x supports databases that span multiple DCs, my question
is how many DCs can Cassandra support in practice?
Thanks
Take a 3 node cluster with RF=3, and QUORUM reads and writes. Consistency
is achieved by ensuring that at least two nodes acknowledge a write, and at
least two nodes have to participate in a read. As a result, you know that
at least one of the two nodes that you are reading from has received the
.
Wayne
On Mar 10, 2014, at 3:52 PM, Tupshin Harper tups...@tupshin.com wrote:
If you really need to rely on this behavior, you should probably do the
whole write as a lightweight transaction, despite the additional overhead.
And to be clear, and to elaborate, null is the default state for a
Cassandra cell if you don't write to it, so you can always create a row
with a null column by writing the row without that column being specified.
Additionally, CQL's DELETE statement optionally takes a list of columns,
so if you
:37 PM, Tupshin Harper tups...@tupshin.com wrote:
Oh sorry, I misunderstood. But now I'm confused about how what you are
trying to do is not accomplished with the existing IF NOT EXISTS syntax.
http://www.datastax.com/documentation/cassandra/2.0/cassandra/dml/dml_ltwt_transaction_c.html
I agree that we are way off the initial topic, but I think we are spot on
the most important topic. As seen in various tickets, including #6704 (wide
row scanners), #6167 (end-slice termination predicate), the existence
of intravert-ug (Cassandra interface to intravert), and a number of others,
Peter,
I didn't specifically call it out, but the interface I just proposed in my
last email would be very much aimed at the goal of making complex
queries less painful and more efficient, by providing a deep integration
mechanism to host that code. It's very much an enough rope to hang
with a co-worker and going over the pros/cons of various
approaches to realizing the goal. I'm still digging into Presto. I saw some
people are working on support for Cassandra in Presto.
On Wed, Mar 12, 2014 at 12:15 PM, Tupshin Harper tups...@tupshin.comwrote:
Peter,
I didn't specifically
It's the difference between reading from only the partitions that you are
interested in, vs. reading every single partition before filtering the
results. At scale, and assuming you don't actually need to read every
partition, there would be a huge difference.
If the model requires you to read every
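A hypothetical CQL illustration of that targeted-vs-filtered difference (table and column names invented for the example; support for filtering on non-indexed columns depends on the release):

```cql
-- Targeted: touches only the partition whose key is specified
SELECT * FROM readings WHERE sensor_id = 'sensor-1';

-- Scan-then-filter: reads every partition and discards non-matching rows
SELECT * FROM readings WHERE temperature > 30 ALLOW FILTERING;
```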
Read the automatic paging portion of this post:
http://www.datastax.com/dev/blog/client-side-improvements-in-cassandra-2-0
On Mar 17, 2014 8:09 PM, Philip G g...@gpcentre.net wrote:
On Mon, Mar 17, 2014 at 4:54 PM, Robert Coli rc...@eventbrite.com wrote:
The form of your question suggests you
Your us-east datacenter has RF=2 and 2 racks, which is the right way
to do it (I would rarely recommend using a different number of racks
than your RF). But by having three nodes on one rack (1b) and only one
on the other(1a), you are telling Cassandra to distribute the data so
that no two
More details would be helpful (exact schema, method of inserting data,
etc.) but you can try dropping the indices and recreating them
after the import is finished.
-Tupshin
On Apr 7, 2014 8:53 AM, Fasika Daksa cassandra.d...@gmail.com wrote:
We are running different workload test on
=3 ? That would make sense,
wouldn't it...
That's what I'll do for production.
Oleg
On 2014-04-07 12:23:51 +, Tupshin Harper said:
Your us-east datacenter has RF=2 and 2 racks, which is the right way
to do it (I would rarely recommend using a different number of racks
than your RF
Constant deletes and rewrites are a very poor pattern to use with
Cassandra. It would be better to write to a new row and partition every
minute and use a TTL to auto expire the old data.
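For instance (hypothetical schema):

```cql
-- One partition per minute; data expires on its own instead of being deleted
INSERT INTO readings_by_minute (minute, sensor_id, value)
VALUES ('2014-04-06 14:55+0000', 'sensor-1', 42.0)
USING TTL 120;
```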
-Tupshin
On Apr 6, 2014 2:55 PM, Yulian Oifa oifa.yul...@gmail.com wrote:
Hello
I am having a row in which
I do not agree with this advice. It can be perfectly reasonable to have
#nodes < 2*RF.
It is common to deploy a 3 node cluster with RF=3 and it works fine as long
as each node can handle 100% of your data, and keep up with the workload.
-Tupshin
On Apr 14, 2014 5:25 AM, Markus Jais
but eventually hardware will fail.
Markus
Tupshin Harper tups...@tupshin.com schrieb am 13:44 Montag, 14.April
2014:
I do not agree with this advice. It can be perfectly reasonable to have
#nodes < 2*RF.
It is common to deploy a 3 node cluster with RF=3 and it works fine as
long as each
tl;dr make sure you have enough capacity in the event of node failure. For
light workloads, that can be fulfilled with nodes=rf.
-Tupshin
On Apr 14, 2014 2:35 PM, Robert Coli rc...@eventbrite.com wrote:
On Mon, Apr 14, 2014 at 2:25 AM, Markus Jais markus.j...@yahoo.de wrote:
It is generally
It is not common, but I know of multiple organizations running with RF=5,
in at least one DC, for HA reasons.
-Tupshin
On Apr 15, 2014 2:36 PM, Robert Coli rc...@eventbrite.com wrote:
On Tue, Apr 15, 2014 at 6:14 AM, Ken Hancock ken.hanc...@schange.comwrote:
Keep in mind if you lose the
Please provide your keyspace definition, and the output of nodetool ring
-Tupshin
On Apr 15, 2014 3:52 PM, Vivek Mishra mishra.v...@gmail.com wrote:
Hi,
I am trying Cassandra lightweight transaction support with Cassandra 2.0.4
cqlsh:twitter create table user(user_id text primary key,
On Wed, Apr 16, 2014 at 1:27 AM, Tupshin Harper tups...@tupshin.comwrote:
Please provide your keyspace definition, and the output of nodetool
ring
-Tupshin
On Apr 15, 2014 3:52 PM, Vivek Mishra mishra.v...@gmail.com wrote:
Hi,
I am trying Cassandra light weight transaction support
,
Mine is a simple case. Running on single node only. Keyspace is:
create keyspace twitter with replication = {'class':'SimpleStrategy',
'replication_factor' : 3}
-Vivek
On Wed, Apr 16, 2014 at 1:27 AM, Tupshin Harper tups...@tupshin.comwrote:
Please provide your keyspace definition
an
Exception, even if some replica nodes are not accessible.
On Wed, Apr 16, 2014 at 2:00 PM, Tupshin Harper tups...@tupshin.comwrote:
No, but you do need a quorum of nodes.
http://www.datastax.com/documentation/cassandra/2.0/cassandra/dml/dml_config_consistency_c.html
SERIAL
A write must
It's often an excellent strategy. No known issues.
-Tupshin
On May 16, 2014 4:13 PM, Anand Somani meatfor...@gmail.com wrote:
Hi,
It seems like it should be possible to have a keyspace replicated only to
a subset of DCs on a given cluster spanning across multiple DCs? Is there
anything
Pull requests encouraged. :)
-Tupshin
On May 17, 2014 7:43 PM, Kevin Burton bur...@spinn3r.com wrote:
AH… looks like there's one in the Datastax java driver. Looks like it
doesn't support everything but probably supports the features I need ;)
So I'll just use that!
On Sat, May 17, 2014
While Astyanax 2.0 is still beta, I think you will find it provides a very
good migration path from the 1.0 thrift based version to the 2.0 native
driver version. Well worth considering if you like the Astyanax API and
functionality. I know of multiple DataStax customers planning on using it.
When one node or DC is down, coordinator nodes being written through will
notice this fact and store hints (hinted handoff is the mechanism), and
those hints are used to send the data that was not able to be replicated
initially.
http://www.datastax.com/dev/blog/modern-hinted-handoff
-Tupshin
For performance reasons, you shouldn't enable vnodes on any Cassandra/DSE
datacenter that is doing hadoop analytics workloads. Other DCs in the
cluster can use vnodes.
-Tupshin
On Jul 2, 2014 5:50 PM, Clint Kelly clint.ke...@gmail.com wrote:
Hi everyone,
Apologies if this is the incorrect
...@gmail.com wrote:
Hi Tupshin,
Thanks for the quick reply. Is the performance concern from the
Hadoop integration needing to set up separate SELECT operations for
all of the unique vnode ranges?
Best regards,
Clint
On Wed, Jul 2, 2014 at 6:00 PM, Tupshin Harper tups
I've seen a lot of deployments, and I think you captured the scenarios and
reasoning quite well. You can apply other nuances and details to #2 (e.g.
segment based on SLA or topology), but I agree with all of your reasoning.
-Tupshin
-Global Field Strategy
-Datastax
On Jul 8, 2014 10:54 AM, Jeremy