a lot of memory for this
single operation, right?
What if the memory needed for this operation is more than fits in the Java
heap? Would this be a problem for Cassandra?
Best regards,
--
Marcelo Elias Del Valle
http://mvalle.com - @mvallebr
Hi,
This question is just for curiosity purposes, I don't need this in my
solution, but it's something I was asking myself...
Is there a way of indexing the partition key values in Cassandra? Has
anyone needed this and found a solution of any kind?
For example, I know the sample below doesn't
on
secondary index translates into full cluster scan...
Regards
Duy Hai DOAN
On Thu, Sep 18, 2014 at 5:44 PM, Marcelo Elias Del Valle
marc...@s1mbi0se.com.br wrote:
Hi,
This question is just for curiosity purposes, I don't need this in my
solution, but it's something I was asking myself
, simply
have a single limit, without the modes and default it to 10 or 25 or some
other relatively low number for “normal” apps.
This would be more developer-friendly, for both new and “normal”
developers... I think.
-- Jack Krupansky
*From:* Marcelo Elias Del Valle marc...@s1mbi0se.com.br
Hi,
I am using Cassandra 2.0.9 running on Debian Wheezy, and I am getting "too
many open files" exceptions when I try to perform a large number of
operations in my 10 node cluster.
I saw the documentation
?
Best regards,
Marcelo.
2014-08-08 17:06 GMT-03:00 Shane Hansen shanemhan...@gmail.com:
Are you using apache or Datastax cassandra?
The datastax distribution ups the file handle limit to 10. That
number's hard to exceed.
On Fri, Aug 8, 2014 at 1:35 PM, Marcelo Elias Del Valle
marc
...@spinn3r.com:
You may want to look at the actual filenames. You might have an app
leaving them open. Also, remember, sockets use FDs so they are in the list
too.
On Fri, Aug 8, 2014 at 1:13 PM, Marcelo Elias Del Valle
marc...@s1mbi0se.com.br wrote:
I am using datastax community
. So, run netstat and check the number of established
connections. This number should not be big.
Thank you,
Andrey
On Fri, Aug 8, 2014 at 12:35 PM, Marcelo Elias Del Valle
marc...@s1mbi0se.com.br wrote:
Hi,
I am using Cassandra 2.0.9 running on Debian Wheezy, and I am having
too
Hi,
I need to execute a map/reduce job to identify data stored in
Cassandra before indexing this data in Elasticsearch.
I have already used ColumnFamilyInputFormat (before I started using CQL) to
write Hadoop jobs to do that, but I used to have a lot of trouble
tuning them, as
:)
Jon
On Mon, Jul 21, 2014 at 8:24 AM, Marcelo Elias Del Valle
marc...@s1mbi0se.com.br wrote:
Hi,
I need to execute a map/reduce job to identify data stored in
Cassandra before indexing this data in Elasticsearch.
I have already used ColumnFamilyInputFormat (before start
.
2014-07-21 14:06 GMT-03:00 Jonathan Haddad j...@jonhaddad.com:
I haven't tried pyspark yet, but it's part of the distribution. My
main language is Python too, so I intend on getting deep into it.
On Mon, Jul 21, 2014 at 9:38 AM, Marcelo Elias Del Valle
marc...@s1mbi0se.com.br wrote:
Hi
Hi Robert,
First of all, thanks for answering.
2014-07-21 20:18 GMT-03:00 Robert Coli rc...@eventbrite.com:
You're wrong, unless you're talking about insertion into a memtable, which
you probably aren't and which probably doesn't actually work that way
enough to be meaningful.
On disk,
, Marcelo Elias Del Valle
marc...@s1mbi0se.com.br wrote:
Although several sstables (disk fragments) may have the same row key,
inside a single sstable row keys and column keys are indexed, right?
Otherwise, doing a GET in Cassandra would take some time.
From the M/R perspective, I was referring
, DuyHai Doan doanduy...@gmail.com
wrote:
I think some figures from nodetool tpstats and nodetool
compactionstats may help us see things more clearly.
And Pavel, when you said batch, did you mean LOGGED batch or UNLOGGED
batch?
On Fri, Jun 20, 2014 at 8:02 PM, Marcelo Elias Del Valle
marc
async.
The native driver reuses connections and intelligently manages the
pool for you. It can also multiplex queries over a single connection.
I am assuming you're using one of the datastax drivers for CQL, btw.
Jon
On Thu, Jun 19, 2014 at 7:37 PM, Marcelo Elias Del Valle
marc
. Version 2.0.8.
It happened when I did many writes (no reads). Writes are done in small
batches of 2 inserts (writing to 2 column families). The values are big
blobs (up to 100Kb).
Any clues?
Pavel
On Thu, Jun 19, 2014 at 8:07 PM, Marcelo Elias Del Valle
marc...@s1mbi0se.com.br wrote
environment.
ml
On Fri, Jun 20, 2014 at 3:29 AM, Marcelo Elias Del Valle
marc...@s1mbi0se.com.br wrote:
Yes, I am using the CQL datastax drivers.
It was good advice, thanks a lot Jonathan.
[]s
2014-06-20 0:28 GMT-03:00 Jonathan Haddad j...@jonhaddad.com:
The only case in which it might
100-200 writes each up
to 100Kb every 15[s].
It is running on a decent cluster of 5 identical nodes: quad-core i7 with
32 GB RAM and 480 GB SSD.
Regards,
Pavel
On Fri, Jun 20, 2014 at 12:31 PM, Marcelo Elias Del Valle
marc...@s1mbi0se.com.br wrote:
Pavel,
In my case, the heap
This is nice!
I was looking for something like this to implement a multi DC cluster
between OVH and Amazon.
Thanks for sharing!
[]s
2014-06-20 15:35 GMT-03:00 Jeremy Jongsma jer...@barchart.com:
Sharing in case anyone else wants to use this:
implementation in your driver.
Astyanax will keep N connections open to each node (configurable) and route
each query in a separate message over an existing connection, waiting until
one becomes available if all are in use.
On Fri, Jun 20, 2014 at 12:32 PM, Marcelo Elias Del Valle
marc
nodes as well.
Regards,
Pavel
On Wed, Jun 18, 2014 at 11:13 AM, Marcelo Elias Del Valle
marc...@s1mbi0se.com.br wrote:
I have a 10 node cluster with cassandra 2.0.8.
I am getting these exceptions in the log when I run my code. What my code
does is just reading data from a CF
at 11:13 AM, Marcelo Elias Del Valle
marc...@s1mbi0se.com.br wrote:
I have a 10 node cluster with cassandra 2.0.8.
I am getting these exceptions in the log when I run my code. What my code
does is just reading data from a CF and in some cases it writes new data.
WARN [Native-Transport
I was taking a look at Cassandra anti-patterns list:
http://www.datastax.com/documentation/cassandra/2.0/cassandra/architecture/architecturePlanningAntiPatterns_c.html
Among them is
SELECT ... IN or index lookups
straightforward w/ the java or python drivers.
On Thu, Jun 19, 2014 at 5:56 PM, Marcelo Elias Del Valle
marc...@s1mbi0se.com.br wrote:
I was taking a look at Cassandra anti-patterns list:
http://www.datastax.com/documentation/cassandra/2.0/cassandra/architecture
with the data.
On Thu, Jun 19, 2014 at 6:11 PM, Marcelo Elias Del Valle
marc...@s1mbi0se.com.br wrote:
But wouldn't using async queries be even worse than using SELECT IN?
The justification in the docs is that I could query many nodes, but I would
still
do it.
Today, I use both async queries
Wouldn't it be better to use nodetool clearsnapshot?
[]s
2014-06-14 17:38 GMT-03:00 S C as...@outlook.com:
I am thinking of rm file.db once the backup is complete. Any special
cases to be careful about?
-Kumar
--
Date: Sat, 14 Jun 2014 13:13:10 -0700
Subject: Re:
I have a 10 node cluster with cassandra 2.0.8.
I am getting these exceptions in the log when I run my code. What my code
does is just reading data from a CF and in some cases it writes new data.
WARN [Native-Transport-Requests:553] 2014-06-18 11:04:51,391
BatchStatement.java (line 228) Batch of
AFAIK, when you run a repair a snapshot is created.
After the repair, I run nodetool clearsnapshot to save disk space.
Not sure if it's your case or not.
[]s
2014-06-18 13:10 GMT-03:00 Brian Tarbox tar...@cabotresearch.com:
We do a repair -pr on each node once a week on a rolling basis.
Should we
Is replication_factor your DC name?
Here is what I would use:
CREATE KEYSPACE IF NOT EXISTS animals
WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy',
'DC1' : 3 };
But in my case, I am using GossipPropertyFileSnitch and DC1 is
configured there, so Cassandra knows which nodes are in
, Michael
michael.la...@nytimes.com wrote:
OK Marcelo, I'll work on it today. -ml
On Tue, Jun 3, 2014 at 8:24 PM, Marcelo Elias Del Valle
marc...@s1mbi0se.com.br wrote:
Hi Michael,
For sure I would be interested in this program!
I am new to both Python and CQL. I started creating
Actually, I have the same doubt. The same happens to me, but I guess it's
because of my lack of knowledge of Cassandra vnodes, somehow...
I just added 3 nodes to my old 2-node cluster, so now I have a 5-node
cluster.
As rows should be in a node calculated by HASH / number of nodes, adding a
new node
on
Cassandra side and not the client side timeout. Even if you set the client
side timeouts, the C* read write timeouts are still respected on that
side.
On Mon, Jun 2, 2014 at 10:55 AM, Marcelo Elias Del Valle
marc...@s1mbi0se.com.br wrote:
I am using Cassandra 2.0 with python CQL.
I have
Hi,
I have some cql CFs in a 2 node Cassandra 2.0.8 cluster.
I realized I created my column family with the wrong partition. Instead of:
CREATE TABLE IF NOT EXISTS entity_lookup (
name varchar,
value varchar,
entity_id uuid,
PRIMARY KEY ((name, value), entity_id))
WITH
caching=all;
Hi,
Has anyone used HIVE + Cassandra Community successfully? I am having
problems mapping the keyspace, but I started wondering if only DSE has
support for it.
I am trying to use HIVE 0.13 to access cassandra 2.0.8 column families
created with CQL3.
Here is how I created my column families:
/alter-cassandra-column-family-primary-key-using-cassandra-cli-or-cql
Cheers,
Jens
On Mon, Jun 2, 2014 at 7:48 PM, Marcelo Elias Del Valle
marc...@s1mbi0se.com.br wrote:
Hi,
I have some cql CFs in a 2 node Cassandra 2.0.8 cluster.
I realized I created my column family with the wrong
Hello everyone,
I was using astyanax connection pool defined as this:
ipSeeds = LOAD_BALANCER_HOST:9160;
conPool.setSeeds(ipSeeds)
.setDiscoveryType(NodeDiscoveryType.TOKEN_AWARE)
.setConnectionPoolType(ConnectionPoolType.TOKEN_AWARE);
However, my cluster has 4 nodes and I have 8 client
Hello everyone,
I have a cassandra cluster running at amazon. I am trying to add a new
datacenter for this cluster now, outside AWS. I know I could use
multiregion, but I would like to be vendor free in terms of cloud.
Reading the article
Hello everyone,
I am trying to create backups of my data on AWS. My goal is to store
the backups on S3 or glacier, as it's cheap to store this kind of data. So,
if I have a cluster with N nodes, I would like to copy data from all N
nodes to S3 and be able to restore later. I know Priam does
the cluster, I am not sure it would help.
So here is my question: any ideas of how to change my model to be able
to query several inputs at a time, consuming less network IO? I am guessing
there must be a way of optimizing it...
Best regards,
--
Marcelo Elias Del Valle
http://mvalle.com - @mvallebr
?
Best regards,
--
Marcelo Elias Del Valle
http://mvalle.com - @mvallebr
thrift service to ip-10-0-0-146.ec2.internal/10.0.0.146:9160
What does the log file look like on 10.0.0.111?
Thanks,
Dean
From: Marcelo Elias Del Valle mvall...@gmail.com
Reply-To: user@cassandra.apache.org
user
of
hadoop at some point but I wonder if test suites suck in 0.20.2 because
the pom file points to that version... depends on if they actually have
tests for map/reduce which is probably a bit hard.
Dean
From: Marcelo Elias Del Valle mvall...@gmail.com
Reply-To: user
map/reduce with murmur partitioner?
Dean
From: Marcelo Elias Del Valle mvall...@gmail.com
Reply-To: user@cassandra.apache.org
Date: Monday, July 22, 2013 4:04 PM
To: user
for the task
Any hint?
Best regards,
--
Marcelo Elias Del Valle
http://mvalle.com - @mvallebr
backups for
Cassandra and could run in any data center?
Best regards,
--
Marcelo Elias Del Valle
http://mvalle.com - @mvallebr
on first node and ip1,ip1 on second node.
Any idea why? It's probably what is causing cassandra to die, right?
2013/2/27 Marcelo Elias Del Valle mvall...@gmail.com
Hello Ben, Thanks for the willingness to help,
2013/2/27 Ben Bromhead b...@instaclustr.com
Have you added the Priam java agent
Hello All,
I’m sending this email because I think it may be interesting for Cassandra
users, as this project makes heavy use of the Cassandra platform.
We are strongly considering opening the source of our DMP (Data Management
Platform), if it proves to be technically interesting to other
that by default only one
ephemeral drive is attached and you must specify all ephemeral drives
that you want to use at launch time. Also, you can create a RAID 0 of
all local disks to provide maximum speed and space.
On 16 January 2013 20:42, Marcelo Elias Del Valle mvall...@gmail.com
wrote
Elias Del Valle
http://mvalle.com - @mvallebr
to Cassandra's storage when running it at
Amazon AWS?
Any thoughts would be highly appreciated.
Best regards,
--
Marcelo Elias Del Valle
http://mvalle.com - @mvallebr
to local EBS - remap EBS to another box -
sstable2json over new sstables - S3 (splitting into ~100MB parts), then
use EMR to consume the JSON part files.
will
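The "splitting into ~100MB parts" step mentioned above could be sketched roughly as follows. This is only an illustration, not the script used in the thread: the function name `split_by_lines`, the `part-NNNNN` file naming, and the choice to cut only at line boundaries are all assumptions. Whether each part is parseable on its own depends on the dump format and on how the consuming job reads it.

```python
import os

def split_by_lines(src_path, out_dir, max_bytes=100 * 1024 * 1024):
    """Split src_path into numbered part files of at most ~max_bytes each,
    cutting only at line boundaries. Returns the list of part paths.
    (Hypothetical helper; names and layout are illustrative.)"""
    parts = []
    out = None
    written = 0
    with open(src_path, "rb") as src:
        for line in src:
            # Start a new part for the first line, or when this line
            # would push a non-empty part over the size limit.
            if out is None or (written and written + len(line) > max_bytes):
                if out is not None:
                    out.close()
                part_path = os.path.join(out_dir, "part-%05d" % len(parts))
                parts.append(part_path)
                out = open(part_path, "wb")
                written = 0
            out.write(line)
            written += len(line)
    if out is not None:
        out.close()
    return parts
```

A single line longer than `max_bytes` still ends up in a part of its own, so parts can exceed the limit in that case.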
On Wed, Jan 16, 2013 at 3:30 PM, Marcelo Elias Del Valle
mvall...@gmail.com wrote:
William,
I just saw your message today
, at least I can make progress now! :D
2012/11/7 Tristan Seligmann mithra...@mithrandi.net
On Wed, Nov 7, 2012 at 9:46 PM, Marcelo Elias Del Valle
mvall...@gmail.com wrote:
Service killed by signal 9
Signal 9 is SIGKILL. Assuming that you're not killing the process
yourself, I guess the most likely
Answering myself: it seems we can't have any non type 1 UUIDs in column
names. I used the UTF8 comparator and saved my UUIDs as strings, it worked.
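For anyone hitting the same thing, a quick client-side check of which UUID version a value has might look like the sketch below. It uses only Python's standard `uuid` module; the helper name `is_time_uuid` is made up for illustration, and it assumes the comparator in question only accepts time-based (version 1) UUIDs, which the thread suggests but does not spell out.

```python
import uuid

def is_time_uuid(value):
    """Return True if value parses as a version 1 (time-based) UUID.
    (Hypothetical helper for illustration.)"""
    try:
        return uuid.UUID(value).version == 1
    except ValueError:
        # Not a well-formed UUID string at all.
        return False

time_based = str(uuid.uuid1())    # version 1: timestamp + node
random_based = str(uuid.uuid4())  # version 4: random
```

Values that fail this check would be the ones to store as plain strings under a UTF8 comparator, as described above.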
2012/10/29 Marcelo Elias Del Valle mvall...@gmail.com
Hello,
I am using ColumnFamilyInputFormat the same way it's described in this
example
://johannburkard.de/software/uuid/
Thanks,
Dean
From: Marcelo Elias Del Valle mvall...@gmail.com
Reply-To: user@cassandra.apache.org
Date: Monday, October 29, 2012 1:17 PM
(in the implementation we made)
so I think the solution you found is valid (sorry for not having
spotted the problem earlier, if this is indeed your case ...)
Regards,
André
2012/10/29 Marcelo Elias Del Valle mvall...@gmail.com
Answering myself: it seems we can't have any non type 1 UUIDs
the node tool with my own aggregate calls if needed
to sum up multiple column families and such).
Thanks,
Dean
--
Marcelo Elias Del Valle
http://mvalle.com - @mvallebr
want to query for streams that match criteria AND which returns a CF name
and they query that CF name so we almost need a query with variables like
select cfName from Meta where x = y and then select * from cfName where
x. Which we can do today.
Dean
From: Marcelo Elias Del Valle mvall
of one).
Later,
Dean
From: Marcelo Elias Del Valle mvall...@gmail.com
Reply-To: user@cassandra.apache.org
Date: Monday, September 24, 2012 1:54 PM
To: user
to each of the nodes? Or would I query 1 node and it would
communicate to others?
Later,
Dean
From: Marcelo Elias Del Valle mvall...@gmail.com
Reply-To: user@cassandra.apache.org
CF?
Thanks a lot for the answers
--
Marcelo Elias Del Valle
http://mvalle.com - @mvallebr
on this may or may not be interested so it creates less noise on this
list too.
Later,
Dean
From: Marcelo Elias Del Valle mvall...@gmail.com
Reply-To: user@cassandra.apache.org
of showing Hex, it shows the real values by
translating the bytes to String for the schema portions that it is aware of
that is.
Later,
Dean
From: Marcelo Elias Del Valle mvall...@gmail.com
Reply-To: user@cassandra.apache.org
user
id?
Best regards,
--
Marcelo Elias Del Valle
http://mvalle.com - @mvallebr
internally if I do a multiget... Is this expensive
in terms of performance and latency?
--
Marcelo Elias Del Valle
http://mvalle.com - @mvallebr
hiring their
support at any time.
--
Marcelo Elias Del Valle
http://mvalle.com - @mvallebr
You're more than welcome to fork that and make it work with 1.1 :)
DSE != (Cassandra + Brisk)
From: Marcelo Elias Del Valle mvall...@gmail.com
Reply-To: user@cassandra.apache.org
community version.
--
Marcelo Elias Del Valle
http://mvalle.com - @mvallebr
-enterprise there.
Anyway, as far as I am concerned, it should not be a problem.
Regards,
Abhijit
On Thu, Sep 20, 2012 at 12:07 AM, Marcelo Elias Del Valle
mvall...@gmail.com wrote:
Not sure if this question should be asked on this list; if this is the
wrong place to ask this, please
a new insert for the user.
It would be a data replication, but I would have no read-before-write and I
am guessing the second query would perform faster.
Any thoughts?
--
Marcelo Elias Del Valle
http://mvalle.com - @mvallebr
own id, right? But why? Wouldn't it be faster to have a composite
key in the requestCF itself?
From: Marcelo Elias Del Valle mvall...@gmail.com
Reply-To: user@cassandra.apache.org
in
my requestCF.
--
Marcelo Elias Del Valle
http://mvalle.com - @mvallebr
parts you are comfortable with. Same for the
questions about HDFS etc. Start with the smallest amount of infrastructure.
Hope that helps.
-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com
On 18/09/2012, at 10:28 AM, Marcelo Elias Del Valle mvall
% of the time used for
indexing. I draw off playOrm examples a lot but one table may be
partitioned by time so each month of data is in a partition, you can then
have indexes on each partition allowing you to do quick queries into
partitions.
Later,
Dean
From: Marcelo Elias Del Valle mvall
are comfortable with. Same for the
questions about HDFS etc. Start with the smallest amount of infrastructure.
Hope that helps.
-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com
On 18/09/2012, at 10:28 AM, Marcelo Elias Del Valle mvall...@gmail.com
should use Cassandra instead of
HBase?
I am sorry if the questions are too basic. I have been watching a lot
of videos and reading a lot of documentation about Cassandra, but honestly,
the more I read, the more questions I have.
Thanks in advance.
Best regards,
--
Marcelo Elias Del Valle
http