[openstack-dev] Fwd: [oslo][mistral] Saga of process then ack and where can we go from here...
Hi Joshua. I think Mistral already has a quick solution: they customised the oslo.messaging RPC layer to achieve ack-after-process in the Mistral code base (a minimal illustration of the idea follows this message). As for a solution in the oslo.messaging code base, I plan to write a spec for the new oslo.messaging driver interface soon, as was agreed during the design session (we need a transport-specific interface, not a user-API-specific one as we have now). Meanwhile we could also start work on the new user API that Mistral needs.

> Begin forwarded message:
>
> From: Joshua Harlow
> Subject: [openstack-dev] [oslo][mistral] Saga of process then ack and where can we go from here...
> Date: May 4, 2016 at 12:24:13 AM GMT+3
> To: "OpenStack Development Mailing List (not for usage questions)"
> Reply-To: "OpenStack Development Mailing List (not for usage questions)"
>
> Howdy folks,
>
> So I met up with *some* of the Mistral folks on Friday last week at the summit, and I was wondering if we as a group can find a path to help that project move forward in its desire to have some kind of process-then-ack (vs the existing ack-then-process) in its usage of the messaging layer.
>
> I got to learn that the following exists in Mistral (sad-face):
>
> https://github.com/openstack/mistral/blob/master/mistral/engine/rpc.py#L38
>
> And it got me thinking about how/if we as a group can possibly allow a variant of https://review.openstack.org/#/c/229186/ to get worked on, merged and released so that the above 'hack' can be removed.
>
> I would also like to come to some kind of understanding that we (Mistral folks would hopefully help here) would remove this kind of change in the future as the longer-term goal (something like https://review.openstack.org/#/c/260246/) progresses.
>
> Thoughts from folks (Mistral and oslo)?
>
> Is there any way we can create a solution that works in the short term (allowing that hack to be removed) while working toward the longer-term goal?
>
> -Josh
>
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

__ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
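To make the terminology concrete, here is a minimal sketch (not Mistral's or oslo.messaging's actual code) of what process-then-ack looks like at the raw pika level: the message is acknowledged only after the handler finishes, so a consumer crash mid-processing leaves the message requeued rather than silently lost. The queue name and handler below are invented for the example.

```python
# Illustrative only: process-then-ack with the raw pika client (pika 1.x API).
# The queue name and handler are made up for this example.
import pika

def handle_task(body):
    # Do the real work here; an unhandled exception prevents the ack below,
    # so the broker will eventually redeliver the message.
    print("processing %r" % body)

def on_message(channel, method, properties, body):
    handle_task(body)                                    # 1. process first
    channel.basic_ack(delivery_tag=method.delivery_tag)  # 2. ack only on success

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()
channel.queue_declare(queue="engine_commands")
# auto_ack=False is what makes process-then-ack possible; an ack-then-process
# consumer would pass auto_ack=True and lose in-flight messages on a crash.
channel.basic_consume(queue="engine_commands", on_message_callback=on_message,
                      auto_ack=False)
channel.start_consuming()
```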
[openstack-dev] [oslo][messaging][pika] Pika driver development status
Hi stackers! In Mitaka a new oslo.messaging driver for RabbitMQ was released (the pika driver). I would like to share information about the current state of the pika driver and the plans for improvements during the next release cycle.

Today we have a driver implementation which:
1) passes all tempest tests
2) implements the heartbeat mechanism properly (it uses pika's implementation instead of implementing its own, as the kombu driver does)
3) implements fast detection of connection problems (tcp_user_timeout for sending, heartbeats for listening) and reconnection to the next available node (works better with a RabbitMQ cluster than kombu)
4) implements pure at-most-once message processing for RPC if retry=0 is set for message sending (the kombu driver does not guarantee at-most-once processing even with retry=0 because it uses acknowledgements); this is illustrated in the sketch after this message
5) is about 50% faster than kombu (at least in my simple test with simulator.py - 1 RPC server process and 1 RPC client process, each client running 5 threads): results for rabbit: 330.2 msg/sec, results for pika: 497.6 msg/sec
6) puts somewhat more load on RabbitMQ than kombu (especially because of the mandatory flag for RPC, used to fail fast if nobody listens on the target). Therefore in performance testing (17 RPC server processes and 17 RPC client processes, each client running 5 threads), when the RabbitMQ cluster is overloaded, the pika driver is about 10% slower for RPC calls. My results: rabbit: 3097 msg/sec, pika: 2825 msg/sec. Casts, however, are about 17% faster than with kombu because they are more lightweight and RabbitMQ is not as heavily loaded in my tests: rabbit: 5687 msg/sec, pika: 6697 msg/sec
7) implements notifications and RPC messaging separately (using different exchanges, etc.), which allows different configurations for different use cases (for example durable notification messaging and non-durable RPC messaging)

Plans for future development:
1) Implement configurable message serialisation (JSON - current implementation; msgpack)
2) Implement a configurable connection factory (pool of connections - current implementation; single connection)
3) Rolling upgrade from kombu to pika is currently impossible; we need to implement some solution for cross-driver rolling upgrades
4) Polishing, bug fixing, profiling etc., as usual

P.S. Thanks to everyone who has been helping to develop the pika driver!

__ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
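To make point 4 concrete, here is a minimal sketch of how an application might select the pika driver and ask for at-most-once RPC by passing retry=0. The 'pika://' transport URL scheme, the topic and the method name are assumptions for illustration, not verified configuration.

```python
# Illustrative sketch: selecting the pika driver and requesting at-most-once RPC.
# The 'pika://' URL scheme, topic and method names are assumptions for the example.
from oslo_config import cfg
import oslo_messaging

transport = oslo_messaging.get_transport(
    cfg.CONF, url="pika://guest:guest@localhost:5672/")
target = oslo_messaging.Target(topic="demo_topic")

# retry=0 asks the driver not to re-send the message; combined with the pika
# driver's behaviour described above this gives at-most-once processing.
client = oslo_messaging.RPCClient(transport, target, retry=0)
result = client.call({}, "echo", message="hello")
print(result)
```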
Re: [openstack-dev] [oslo.messaging][devstack] Pika RabbitMQ driver implementation
Hello Joshua, thank you for your feedback.

> This will end up on review.openstack.org right so that it can be properly reviewed (it will likely take a while since it looks to be ~1000+ lines of code)?

Yes, sure, I will send this patch to review.openstack.org, but first of all I need to get the devstack patch merged (https://review.openstack.org/#/c/226348/). Then I will add gate jobs that test the new driver using devstack, and only then send the pika driver patch for review.

> Also a suggestion: before that merges, can docs be added? Seems like very little docstrings about what/why/how. For sustainability purposes that would be appreciated I think.

Ok. Will add.

On Fri, Sep 25, 2015 at 6:58 PM, Joshua Harlow <harlo...@outlook.com> wrote:
> Also a side question, that someone might know,
>
> Whatever happened to the folks from RabbitMQ (incorporated? Pivotal?) who were going to get involved in oslo.messaging; did that ever happen, if anyone knows?
>
> They might be a good bunch of people to review such a pika driver (since I think they as a corporation created pika?).
>
> Dmitriy Ukhlov wrote:
>
>> Hello stackers,
>>
>> I'm working on a new oslo.messaging RabbitMQ driver implementation which uses the pika client library instead of kombu. It is related to https://blueprints.launchpad.net/oslo.messaging/+spec/rabbit-pika.
>> In this letter I want to share the current results and hopefully get first feedback from you.
>> The code is now available here:
>> https://github.com/dukhlov/oslo.messaging/blob/master/oslo_messaging/_drivers/impl_pika.py
>>
>> Current status of this code:
>> - the pika driver passes functional tests
>> - the pika driver passes tempest smoke tests
>> - the pika driver passes almost all tempest full tests (except 5), but it seems that the reason is not related to oslo.messaging
>> I also created a small devstack patch to support pika driver testing on the gate (https://review.openstack.org/#/c/226348/)
>>
>> Next steps:
>> - communicate with Manish (blueprint owner)
>> - write a spec for this blueprint
>> - send the patch for review when the spec and the devstack patch get merged.
>>
>> Thank you.
>>
>> --
>> Best regards,
>> Dmitriy Ukhlov
>> Mirantis Inc.
>>
>> __
>> OpenStack Development Mailing List (not for usage questions)
>> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
> __________
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

-- Best regards, Dmitriy Ukhlov Mirantis Inc. __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] [oslo.messaging][devstack] Pika RabbitMQ driver implementation
Hello stackers, I'm working on a new oslo.messaging RabbitMQ driver implementation which uses the pika client library instead of kombu. It is related to https://blueprints.launchpad.net/oslo.messaging/+spec/rabbit-pika. In this letter I want to share the current results and hopefully get first feedback from you. The code is now available here: https://github.com/dukhlov/oslo.messaging/blob/master/oslo_messaging/_drivers/impl_pika.py

Current status of this code:
- the pika driver passes functional tests
- the pika driver passes tempest smoke tests
- the pika driver passes almost all tempest full tests (except 5), but it seems that the reason is not related to oslo.messaging

I also created a small devstack patch to support pika driver testing on the gate (https://review.openstack.org/#/c/226348/)

Next steps:
- communicate with Manish (blueprint owner)
- write a spec for this blueprint
- send the patch for review when the spec and the devstack patch get merged.

Thank you.

-- Best regards, Dmitriy Ukhlov Mirantis Inc. __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
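For context on how such a driver is exercised, below is a rough sketch of the ordinary oslo.messaging RPC server code path that the functional and tempest runs drive; the topic, server and endpoint names are invented for the example, and this is not part of the driver patch itself.

```python
# Rough sketch of the server side that any oslo.messaging driver must support.
# Topic, server and endpoint names are invented for illustration.
from oslo_config import cfg
import oslo_messaging

class DemoEndpoint(object):
    def echo(self, ctxt, message):
        # A trivial RPC method the client can call.
        return message

transport = oslo_messaging.get_transport(cfg.CONF)
target = oslo_messaging.Target(topic="demo_topic", server="server-1")
server = oslo_messaging.get_rpc_server(transport, target, [DemoEndpoint()],
                                       executor="blocking")
server.start()
server.wait()
```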
Re: [openstack-dev] [MagnetoDB] Andrey Ostapenko core nomination
Andrey is a very active contributor, a good team player, and he helped us a lot during the previous development cycle. +1 from my side. On Fri, Dec 26, 2014 at 4:16 PM, isviridov isviri...@mirantis.com wrote: Hello stackers and magnetians, I suggest nominating Andrey Ostapenko [1] to MagnetoDB cores. During the last months he has made a huge contribution to MagnetoDB [2]. Andrey drives Tempest and python-magnetodbclient successfully. Please raise your hands. Thank you, Ilya Sviridov [1] http://stackalytics.com/report/users/aostapenko [2] http://stackalytics.com/report/contribution/magnetodb/90 -- Best regards, Dmitriy Ukhlov Mirantis Inc. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [MagnetoDB] Core developer nomination
+1 from me. Charles is a very active contributor, and I believe that with such a developer on the core team the MagnetoDB project will become much better than it is now. On Thu, Sep 18, 2014 at 10:09 AM, Illia Khudoshyn ikhudos...@mirantis.com wrote: Congrats, Charles! Great job! On Thu, Sep 18, 2014 at 12:05 AM, Ilya Sviridov isviri...@mirantis.com wrote: Hello magnetodb contributors, I'm glad to nominate Charles Wang to the core developers of MagnetoDB. He is the top non-core reviewer [1], implemented notifications [2] in MagnetoDB, and made great progress with performance, stability and scalability testing of MagnetoDB. [1] http://stackalytics.com/report/contribution/magnetodb/90 [2] https://blueprints.launchpad.net/magnetodb/+spec/magnetodb-notifications Welcome to the team, Charles! Looking forward to your contributions. -- Ilya Sviridov isviridov @ FreeNode -- Best regards, Illia Khudoshyn, Software Engineer, Mirantis, Inc. 38, Lenina ave. Kharkov, Ukraine www.mirantis.com www.mirantis.ru Skype: gluke_work ikhudos...@mirantis.com ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev -- Best regards, Dmitriy Ukhlov Mirantis Inc. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [magnetodb] Backup procedure for Cassandra backend
Hi Romain! Thank you for the useful info about your Cassandra backups. We have not tried to tune the Cassandra compaction properties yet. MagnetoDB is a DynamoDB-like REST API, which means it is a key-value storage itself and should be able to handle different kinds of load, since the load profile depends on the user application that uses MagnetoDB. Do you have some recommendations or comments regarding the read/write ratio based on that?

On Tue, Sep 2, 2014 at 4:29 PM, Romain Hardouin romain.hardo...@cloudwatt.com wrote: Hi Mirantis guys, I have set up two Cassandra backups: The first backup procedure was similar to the one you want to achieve. The second backup used SAN features (EMC VNX snapshots), so it was very specific to the environment. Backing up an entire cluster (therefore all replicas) is challenging when dealing with big data and not really needed. If your replicas are spread across several data centers then you could back up just one data center. In that case you back up only one replica. Depending on your needs you may want to back up twice (I mean back up the backup, using a tape library for example) and then store it in an external location for disaster recovery, requirements specification, norms, etc. The snapshot command issues a flush before actually taking the snapshot, so the flush command is not necessary. https://github.com/apache/cassandra/blob/c7ebc01bbc6aa602b91e105b935d6779245c87d1/src/java/org/apache/cassandra/db/ColumnFamilyStore.java#L2213 (snapshotWithoutFlush() is used by the scrub command) Just out of curiosity, have you tried the leveled compaction strategy? It seems that you use STCS. Does your use case imply many updates? What is your read/write ratio? Best, Romain

-- From: Denis Makogon dmako...@mirantis.com To: OpenStack Development Mailing List (not for usage questions) openstack-dev@lists.openstack.org Sent: Friday, August 29, 2014 4:33:59 PM Subject: Re: [openstack-dev] [magnetodb] Backup procedure for Cassandra backend

On Fri, Aug 29, 2014 at 4:29 PM, Dmitriy Ukhlov dukh...@mirantis.com wrote: Hello Denis, Thank you for the very useful knowledge sharing. But I have one more question. As far as I understood, if we have replication factor 3 it means that our backup may contain three copies of the same data. It may also contain a set of not yet compacted SSTables. Do we have any ability to compact the collected backup data before moving it to the backup storage?

Thanks for the fast response, Dmitriy. With replication factor 3 - yes, this looks like a feature that allows backing up only one node instead of 3 of them. In other cases, we would need to iterate over each node, as you know. Correct, it is possible to have non-compacted SSTables. To accomplish compaction we might need to use the compaction mechanism provided by nodetool, see http://www.datastax.com/documentation/cassandra/2.0/cassandra/tools/toolsCompact.html; we just need to take into account that the SSTable may already have been compacted, in which case forcing compaction wouldn't give valuable benefits. Best regards, Denis Makogon

On Fri, Aug 29, 2014 at 2:01 PM, Denis Makogon dmako...@mirantis.com wrote: Hello, stackers. I'd like to start a thread related to the backup procedure for MagnetoDB, to be precise, for the Cassandra backend. In order to set up a backup procedure for Cassandra we need to understand how backups work. To perform a backup:
1. We need to SSH into each node
2. Call 'nodetool snapshot' with appropriate parameters
3. Collect the backup.
4. Send the backup to remote storage.
5. Remove the initial snapshot

Let's take a look at how 'nodetool snapshot' works. Cassandra backs up data by taking a snapshot of all on-disk data files (SSTable files) stored in the data directory. Each time an SSTable gets flushed and snapshotted, it becomes a hard link against the initial SSTable, pinned to a specific timestamp. Snapshots are taken per keyspace or per CF, and while the system is online. However, nodes must be taken offline in order to restore a snapshot. Using a parallel ssh tool (such as pssh), you can flush and then snapshot an entire cluster. This provides an eventually consistent backup. Although no one node is guaranteed to be consistent with its replica nodes at the time a snapshot is taken, a restored snapshot can resume consistency using Cassandra's built-in consistency mechanisms. After a system-wide snapshot has been taken, you can enable incremental backups on each node (disabled by default) to back up data that has changed since the last snapshot was taken. Each time an SSTable is flushed, a hard link is copied into a /backups subdirectory of the data directory. Now let's see how we can deal with a snapshot once it's taken. Below is the list of commands that need to be executed to prepare a snapshot: Flushing SSTables for consistency
Re: [openstack-dev] [magnetodb] Backup procedure for Cassandra backend
Hello Denis, Thank you for the very useful knowledge sharing. But I have one more question. As far as I understood, if we have replication factor 3 it means that our backup may contain three copies of the same data. It may also contain a set of not yet compacted SSTables. Do we have any ability to compact the collected backup data before moving it to the backup storage?

On Fri, Aug 29, 2014 at 2:01 PM, Denis Makogon dmako...@mirantis.com wrote: Hello, stackers. I'd like to start a thread related to the backup procedure for MagnetoDB, to be precise, for the Cassandra backend. In order to set up a backup procedure for Cassandra we need to understand how backups work. To perform a backup:
1. We need to SSH into each node
2. Call 'nodetool snapshot' with appropriate parameters
3. Collect the backup.
4. Send the backup to remote storage.
5. Remove the initial snapshot

Let's take a look at how 'nodetool snapshot' works. Cassandra backs up data by taking a snapshot of all on-disk data files (SSTable files) stored in the data directory. Each time an SSTable gets flushed and snapshotted, it becomes a hard link against the initial SSTable, pinned to a specific timestamp. Snapshots are taken per keyspace or per CF, and while the system is online. However, nodes must be taken offline in order to restore a snapshot. Using a parallel ssh tool (such as pssh), you can flush and then snapshot an entire cluster. This provides an eventually consistent backup. Although no one node is guaranteed to be consistent with its replica nodes at the time a snapshot is taken, a restored snapshot can resume consistency using Cassandra's built-in consistency mechanisms. After a system-wide snapshot has been taken, you can enable incremental backups on each node (disabled by default) to back up data that has changed since the last snapshot was taken. Each time an SSTable is flushed, a hard link is copied into a /backups subdirectory of the data directory.

Now let's see how we can deal with a snapshot once it's taken. Below is the list of commands that need to be executed to prepare a snapshot:

Flushing SSTables for consistency: 'nodetool flush'

Creating snapshots (for example of all keyspaces): nodetool snapshot -t %(backup_name)s 1>/dev/null, where backup_name is the name of the snapshot

Once it's done we would need to collect all hard links into a common directory (keeping the initial file hierarchy): sudo tar cpzfP /tmp/all_ks.tar.gz $(sudo find %(datadir)s -type d -name %(backup_name)s), where backup_name is the name of the snapshot and datadir is the storage location (/var/lib/cassandra/data by default)

Note that this operation can be extended: - if Cassandra was launched with more than one data directory (see cassandra.yaml http://www.datastax.com/documentation/cassandra/2.0/cassandra/configuration/configCassandra_yaml_r.html) - if we want to back up only: - certain keyspaces at the same time - one keyspace - a list of CFs for a given keyspace

Useful links http://www.datastax.com/documentation/cassandra/2.0/cassandra/tools/toolsNodetool_r.html Best regards, Denis Makogon ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev -- Best regards, Dmitriy Ukhlov Mirantis Inc. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
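A possible way to wrap the per-node steps above into a small script is sketched below; it simply shells out to nodetool and tar as described in the thread. The snapshot name, the data directory layout and the clearsnapshot call are assumptions that would need checking against the target Cassandra version; a real tool would also add error handling and the upload step.

```python
# Sketch of automating the per-node backup steps with plain subprocess calls.
# Paths, keyspace layout and the snapshot name are assumptions, not a final tool.
import glob
import os
import subprocess

BACKUP_NAME = "mdb_backup"
DATA_DIR = "/var/lib/cassandra/data"   # default storage location

def take_snapshot():
    # 1. Flush memtables so the on-disk SSTables are up to date.
    subprocess.check_call(["nodetool", "flush"])
    # 2. Take the snapshot (hard links against the current SSTables).
    subprocess.check_call(["nodetool", "snapshot", "-t", BACKUP_NAME])

def archive_snapshot(archive_path="/tmp/all_ks.tar.gz"):
    # 3. Collect every <data_dir>/<keyspace>/<cf>/snapshots/<name> directory
    #    into one tarball, keeping the original hierarchy (tar -P keeps paths).
    snapshot_dirs = glob.glob(
        os.path.join(DATA_DIR, "*", "*", "snapshots", BACKUP_NAME))
    subprocess.check_call(["tar", "cpzfP", archive_path] + snapshot_dirs)

def clear_snapshot():
    # 5. Remove the snapshot once the archive has been shipped to remote storage.
    subprocess.check_call(["nodetool", "clearsnapshot", "-t", BACKUP_NAME])

if __name__ == "__main__":
    take_snapshot()
    archive_snapshot()
    # ... send /tmp/all_ks.tar.gz to remote storage here ...
    clear_snapshot()
```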
Re: [openstack-dev] [MagnetoDB] MagnetoDB events notifications
Hi Charles! It looks to me like we are duplicating functionality of the Ceilometer project. Am I wrong? Have you considered Ceilometer integration for monitoring MagnetoDB? On Fri, May 23, 2014 at 6:55 PM, Charles Wang charles_w...@symantec.com wrote: Folks, Please take a look at the initial draft of the MagnetoDB Events and Notifications wiki page: https://wiki.openstack.org/wiki/MagnetoDB/notification. Your feedback will be appreciated. Thanks, Charles Wang charles_w...@symantec.com ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev -- Best regards, Dmitriy Ukhlov Mirantis Inc. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Ceilometer] Question of necessary queries for Event implemented on HBase
Hello Igor, Sounds reasonable. On 05/21/2014 02:38 PM, Igor Degtiarov wrote: Hi, I have found that the filter model for Events has mandatory start_time and end_time parameters for the events period. So it seems that a row key structured as 'timestamp + event_id' will be more suitable. -- Best regards, Dmitriy Ukhlov Mirantis Inc. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Ceilometer] Question of necessary queries for Event implemented on HBase
Hello Igor! Could you please clarify why we need an 'event_id + reversed_timestamp' row key? Doesn't event_id identify the row by itself? On Tue, Apr 29, 2014 at 11:08 AM, Igor Degtiarov idegtia...@mirantis.com wrote: Hi, everybody. I've started to work on the implementation of Events in Ceilometer on the HBase backend within the scope of the blueprint https://blueprints.launchpad.net/ceilometer/+spec/hbase-events-feature By now Events have been implemented only in SQL. You know, using SQL we can build any query we need. With HBase it is another story. The data structure is built based on the queries we need, so to construct the structure of Events on HBase it is very important to answer the question of which queries should be implemented to retrieve events from storage. I registered the bp https://blueprints.launchpad.net/ceilometer/+spec/hbase-events-structure for discussing the Events structure in HBase. For now, a preliminary structure of Events in HBase has been prepared: table: Events - rowkey: event_id + reversed_timestamp - column: event_type = string with a description of the event - [list of columns: trait_id + trait_desc + trait_type = trait_data] The proposed structure will support the following queries: - event's generation time - event id - event type - trait: id, description, type Any thoughts about additional queries that are necessary for Events? I'll publish the patch with the current implementation soon. Sincerely, Igor Degtiarov ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev -- Best regards, Dmitriy Ukhlov Mirantis Inc. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
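For illustration only, the two row-key layouts being discussed could be composed roughly as follows; the fixed-width big-endian timestamp and the separator-free concatenation are assumptions, not the blueprint's final design.

```python
# Illustrative helpers for composing HBase row keys for events. Field widths
# and encoding are assumptions made for the example.
import struct

MAX_TS = 0xFFFFFFFFFFFFFFFF  # largest 64-bit value, used to reverse ordering

def key_event_first(event_id, timestamp_us):
    # 'event_id + reversed_timestamp': direct lookups by event id; the newest
    # occurrence of a given id sorts first because the timestamp is reversed.
    reversed_ts = MAX_TS - timestamp_us
    return event_id.encode("utf-8") + struct.pack(">Q", reversed_ts)

def key_time_first(event_id, timestamp_us):
    # 'timestamp + event_id': rows are ordered by generation time, so the
    # mandatory start_time/end_time filter becomes a simple row-range scan.
    return struct.pack(">Q", timestamp_us) + event_id.encode("utf-8")
```

The trade-off the thread converges on is visible here: with the time-first key the mandatory time-window filter maps onto a contiguous scan, while the id-first key only helps point lookups by event id.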
Re: [openstack-dev] [MagnetoDB] Configuring consistency draft of concept
Hello Maksym, Thank you for your work! I suggest considering a more general approach and hiding backend-specific stuff. I have the following proposal (illustrated in the sketch after this message):
1) add support for inconsistent write operations by adding a consistent = True or False parameter to PutItem, UpdateItem and DeleteItem requests (as well as to GetItem and Query requests)
2) add the possibility to set backend-specific metadata (it would be nice to use some generic format like JSON) per table within the scope of the create table request. I suggest specifying the mapping of the Cassandra consistency level per operation type (consistent read, inconsistent read, consistent write, inconsistent write)

I agree that we currently have a limitation for inconsistent write operations on tables with indexed fields and for requests with specified expected conditions. I have thought about how to overcome this limitation, and it seems that I have found a solution for index handling without CAS operations. Maybe it is reasonable to redesign it a bit.

On Mon, Apr 28, 2014 at 8:33 AM, MAKSYM IARMAK (CS) maksym_iar...@symantec.com wrote: Hi, Because we can't use inconsistent writes if we use an indexed table and the conditional operations that the indexes are based on (this stuff requires the state of the data), we have one more issue. If we want to make a write with consistency level ONE (WEAK) to an indexed table, we will have 2 variants: 1. Carry out the operation successfully and implicitly perform the write to the indexed table with the minimal consistency level possible for it (QUORUM); 2. Raise an exception that we cannot perform this operation and list all possible CLs for this operation. I personally prefer the 2nd variant. So, does anybody have objections or maybe other ideas? -- From: MAKSYM IARMAK (CS) [maksym_iar...@symantec.com] Sent: Friday, April 25, 2014 9:14 PM To: openstack-dev@lists.openstack.org Subject: [openstack-dev] [MagnetoDB] Configuring consistency draft of concept So, here is the specification draft of the concept. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev -- Best regards, Dmitriy Ukhlov Mirantis Inc. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
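A rough illustration of what the two proposals might look like on the wire is given below. The field names ('consistent', the per-backend metadata block) and the exact consistency-level values are assumptions for discussion, not a finalized API.

```python
# Purely illustrative payloads for the proposal above; field names are
# assumptions, not the final MagnetoDB API.

# Proposal 1: a per-request switch on write operations (and on reads).
put_item_request = {
    "table_name": "data",
    "item": {"Attr1": {"S": "id-1"}, "Attr2": {"S": "2014-04-28"}},
    # False would allow a cheap, inconsistent write; True keeps the
    # current behaviour.
    "consistent": False,
}

# Proposal 2: backend-specific metadata attached to the table at creation
# time, mapping a Cassandra consistency level to each operation class.
create_table_backend_metadata = {
    "cassandra": {
        "consistent_read": "QUORUM",
        "inconsistent_read": "ONE",
        "consistent_write": "QUORUM",
        "inconsistent_write": "ONE",
    }
}
```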
Re: [openstack-dev] (no subject)
In my opinion it would be enough to read the table schema from stdin; then it is possible to use a pipe to feed the input from any stream (see the sketch after this message). On Fri, Apr 25, 2014 at 6:25 AM, ANDREY OSTAPENKO (CS) andrey_ostape...@symantec.com wrote: Hello, everyone! Now I'm starting to implement a CLI client for the key-value storage service MagnetoDB. I'm going to use the heat approach for CLI commands, e.g. heat stack-create --template-file FILE, because we have too many parameters to pass to the command. For example, the table creation command: magnetodb create-table --description-file FILE The file will contain JSON data, e.g.:

    {
        "table_name": "data",
        "attribute_definitions": [
            {"attribute_name": "Attr1", "attribute_type": "S"},
            {"attribute_name": "Attr2", "attribute_type": "S"},
            {"attribute_name": "Attr3", "attribute_type": "S"}
        ],
        "key_schema": [
            {"attribute_name": "Attr1", "key_type": "HASH"},
            {"attribute_name": "Attr2", "key_type": "RANGE"}
        ],
        "local_secondary_indexes": [
            {
                "index_name": "IndexName",
                "key_schema": [
                    {"attribute_name": "Attr1", "key_type": "HASH"},
                    {"attribute_name": "Attr3", "key_type": "RANGE"}
                ],
                "projection": {"projection_type": "ALL"}
            }
        ]
    }

Blueprint: https://blueprints.launchpad.net/magnetodb/+spec/magnetodb-cli-client If you have any comments, please let me know. Best regards, Andrey Ostapenko -- Best regards, Dmitriy Ukhlov Mirantis Inc. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
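A small sketch of the stdin idea: the same create-table command can read its JSON description either from a file or from a pipe. The flag and command names follow the example above but are otherwise illustrative, not the real python-magnetodbclient implementation.

```python
# Sketch of accepting the table schema from a file or, by default, from stdin,
# so that "cat table.json | magnetodb create-table" also works.
import argparse
import json
import sys

parser = argparse.ArgumentParser(prog="magnetodb")
parser.add_argument("command", choices=["create-table"])
parser.add_argument("--description-file", type=argparse.FileType("r"),
                    default=sys.stdin,
                    help="JSON table description; defaults to stdin")
args = parser.parse_args()

schema = json.load(args.description_file)
print("creating table %s" % schema["table_name"])
# ... here the client would POST the schema to the MagnetoDB API ...
```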
[openstack-dev] [MagnetoDB] Doubt about test data correctness
Hello everyone! I found out that some MagnetoDB tests use test data with an empty attribute value. Is that correct? Does DynamoDB allow such behavior? Please take a look: https://github.com/stackforge/magnetodb/blob/master/tempest/api/keyvalue/stable/rest/test_put_item.py#L39 -- Best regards, Dmitriy Ukhlov Mirantis Inc. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] [MagnetoDB] Confusing Cassandra behavior
Hello everyone! Today I ran into unexpected Cassandra behavior. Please keep in mind that if you execute an UPDATE query and set all fields to null (or to empty collections for collection types), it can delete your record, but it can also merely set the values to null and keep the record alive. It depends on how the record was created: with an INSERT query or with an UPDATE query. Please take a look at the reproduction steps here: https://gist.github.com/dukhlov/11195881. FYI: Cassandra 2.0.7 has been released. As we know, it contains some fixes for conditional operations and batch operations which are necessary for us, so it would be nice to update the MagnetoDB devstack to use Cassandra 2.0.7. -- Best regards, Dmitriy Ukhlov Mirantis Inc. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
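For convenience, a rough reproduction sketch with the DataStax python-driver is below; the keyspace, table and column names are made up here, and the authoritative steps are the ones in the linked gist.

```python
# Rough reproduction sketch (keyspace, table and column names are assumptions;
# see https://gist.github.com/dukhlov/11195881 for the original CQL steps).
from cassandra.cluster import Cluster

session = Cluster(["127.0.0.1"]).connect("demo_ks")  # keyspace assumed to exist
session.execute("CREATE TABLE IF NOT EXISTS t (id int PRIMARY KEY, val text)")

# Row created with INSERT: nulling its only regular column keeps the row alive,
# because INSERT also writes a standalone row marker for the primary key.
session.execute("INSERT INTO t (id, val) VALUES (1, 'x')")
session.execute("UPDATE t SET val = null WHERE id = 1")

# Row created only with UPDATE: there is no row marker, so nulling the last
# remaining column makes the whole row disappear.
session.execute("UPDATE t SET val = 'x' WHERE id = 2")
session.execute("UPDATE t SET val = null WHERE id = 2")

for row in session.execute("SELECT id, val FROM t"):
    print(row)   # expect id=1 with val=None, and no row at all for id=2
```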
[openstack-dev] Fwd: [MagnetoDB] Confusing Cassandra behavior
-- Forwarded message -- From: Dmitriy Ukhlov dukh...@mirantis.com Date: Tue, Apr 22, 2014 at 10:56 PM Subject: [openstack-dev][MagnetoDB] Confusing Cassandra behavior To: OpenStack Development Mailing List (not for usage questions) openstack-dev@lists.openstack.org Hello everyone! Today I ran into unexpected Cassandra behavior. Please keep in mind that if you execute an UPDATE query and set all fields to null (or to empty collections for collection types), it can delete your record, but it can also merely set the values to null and keep the record alive. It depends on how the record was created: with an INSERT query or with an UPDATE query. Please take a look at the reproduction steps here: https://gist.github.com/dukhlov/11195881. FYI: Cassandra 2.0.7 has been released. As we know, it contains some fixes for conditional operations and batch operations which are necessary for us, so it would be nice to update the MagnetoDB devstack to use Cassandra 2.0.7. -- Best regards, Dmitriy Ukhlov Mirantis Inc. -- Best regards, Dmitriy Ukhlov Mirantis Inc. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [MagnetoDB] Best practices for uploading large amounts of data
On 03/28/2014 11:29 AM, Serge Kovaleff wrote: Hi Iliia, I would take a look into BSON http://bsonspec.org/ Cheers, Serge Kovaleff On Thu, Mar 27, 2014 at 8:23 PM, Illia Khudoshyn ikhudos...@mirantis.com wrote: Hi, Openstackers, I'm currently working on adding bulk data load functionality to MagnetoDB. This functionality implies inserting huge amounts of data (billions of rows, gigabytes of data). The data being uploaded is a set of JSONs (for now). The question I'm interested in is the way of data transportation. For now I do a streaming HTTP POST request from the client side with gevent.pywsgi on the server side. Could anybody suggest any (better?) approach for the transportation, please? What are the best practices for that? Thanks in advance. -- Best regards, Illia Khudoshyn, Software Engineer, Mirantis, Inc. 38, Lenina ave. Kharkov, Ukraine www.mirantis.com www.mirantis.ru Skype: gluke_work ikhudos...@mirantis.com ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

Hi Iliia, I guess that if we are talking about Cassandra batch loading, the fastest way is to generate SSTables locally and load them into Cassandra via JMX or sstableloader http://www.datastax.com/dev/blog/bulk-loading If you want to implement bulk load via the MagnetoDB layer (not into Cassandra directly), you could try to use a plain TCP socket and implement your own binary protocol (using BSON, for example). HTTP is a text protocol, so using a TCP socket can help you avoid the overhead of base64 encoding. In my opinion, working with HTTP and BSON is a doubtful solution because you would use two-phase encoding and decoding: 1) object to BSON, 2) BSON to base64, 3) base64 to BSON, 4) BSON to object, instead of just 1) object to JSON, 2) JSON to object in the case of HTTP + JSON. HTTP streaming, as far as I know, is an asynchronous flavour of HTTP (a client-side sketch follows this message). You can expect performance to grow thanks to skipping the generation of an HTTP response on the server side and not waiting for that response on the client side for each chunk. But you still need to send almost the same amount of data, so if network throughput is your bottleneck, it doesn't help; if the server side is your bottleneck, it doesn't help either. Also pay attention to the fact that, in any case, the MagnetoDB Cassandra storage currently converts your data into a CQL query, which is also text. It would be nice to implement the MagnetoDB BatchWriteItem operation via Cassandra SSTable generation and loading via sstableloader, but unfortunately, as far as I know, this functionality is implemented only in the Java world. -- Best regards, Dmitriy Ukhlov Mirantis Inc. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
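For reference, the approach currently in use (a streaming HTTP POST from the client, per Illia's message) can be sketched like this with the requests library; the endpoint URL and the newline-delimited JSON row format are assumptions made for the illustration.

```python
# Client-side sketch of a streaming bulk upload: newline-delimited JSON rows
# sent in a single chunked HTTP POST. URL and row format are assumptions.
import json
import requests

def row_stream(rows):
    for row in rows:
        yield (json.dumps(row) + "\n").encode("utf-8")

rows = ({"id": i, "payload": "x" * 100} for i in range(1000000))
# Passing a generator makes requests use chunked transfer encoding, so the
# whole data set never has to be held in memory on the client side.
resp = requests.post("http://magnetodb.example.com/v1/data/tables/demo/bulk",
                     data=row_stream(rows))
print(resp.status_code)
```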
Re: [openstack-dev] [magnetodb] Using gevent in MagnetoDB. OpenStack standards and approaches
Doug and Ryan, Thank you for your opinions! It is very important for us to gather as much experience as we can in order to pass OpenStack incubation painlessly. I would also be glad to see you, and everyone who is interested in the MagnetoDB project, at the design session "MagnetoDB, key-value storage. OpenStack usecases" at the OpenStack Juno Design Summit.

On Wed, Mar 19, 2014 at 3:49 PM, Ryan Petrello ryan.petre...@dreamhost.com wrote: Dmitriy, Gunicorn + gevent + pecan play nicely together, and they're a combination I've used to good success in the past. Pecan even comes with some helpers for integrating with gunicorn: $ gunicorn_pecan pecan_config.py -k gevent -w4 http://pecan.readthedocs.org/en/latest/deployment.html?highlight=gunicorn#gunicorn --- Ryan Petrello Senior Developer, DreamHost ryan.petre...@dreamhost.com

On Mar 18, 2014, at 2:51 PM, Dmitriy Ukhlov dukh...@mirantis.com wrote: Hello openstackers, We are working on the MagnetoDB project and trying our best to follow OpenStack standards. MagnetoDB is aimed to be a high-performance, scalable, OpenStack-based WSGI application which provides an interface to a highly available, distributed, reliable key-value storage. We investigated best practices and identified the following points: * to avoid problems with the GIL, our application should be executed in single-thread mode with non-blocking IO (using greenlets or other Python-specific approaches to achieve this) * to make MagnetoDB scalable it is necessary to make MagnetoDB stateless. This allows us to run a lot of independent MagnetoDB processes and spread the request flow between them: * on a single node, to load all CPU cores * on different nodes, for horizontal scalability * use Cassandra as the most reliable and mature distributed key-value storage * use the DataStax python-driver as the most modern Cassandra Python client, which supports the newest CQL3 and Cassandra native binary protocol feature set. Considering these points, the following technologies were chosen: * gevent as one of the fastest non-blocking single-thread WSGI servers. It is based on the greenlet library and supports monkey patching of the standard threading library. That is necessary because the DataStax python-driver uses the threading library, and its backlog has a task to add gevent support. (We patched python-driver ourselves to enable this feature as a temporary solution and are waiting for new python-driver releases.) This makes gevent more interesting to use than other analogues (like eventlet, for example) * gunicorn as the WSGI server, which is able to run a few worker processes plus a master process for managing the workers and routing requests between them. It also has gevent integration and can run gevent-based workers. We also analyzed analogues, such as uWSGI. It looks faster, but unfortunately we didn't manage to get uWSGI to work in multi-process mode with the MagnetoDB application. I also want to add that currently the oslo WSGI framework is used for request routing. I know that the current OpenStack trend is to migrate WSGI services to the Pecan WSGI framework; maybe that is reasonable for MagnetoDB too. We would like to hear your opinions about the libraries and approaches we have chosen, and we would appreciate your help and support in finding the best balance between performance, developer friendliness and OpenStack standards. -- Best regards, Dmitriy Ukhlov Mirantis Inc. 
___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev -- Best regards, Dmitriy Ukhlov Mirantis Inc. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] [magnetodb] Using gevent in MagnetoDB. OpenStack standards and approaches
Hello openstackers, We are working on the MagnetoDB project and trying our best to follow OpenStack standards. MagnetoDB is aimed to be a high-performance, scalable, OpenStack-based WSGI application which provides an interface to a highly available, distributed, reliable key-value storage. We investigated best practices and identified the following points:
1. to avoid problems with the GIL, our application should be executed in single-thread mode with non-blocking IO (using greenlets or other Python-specific approaches to achieve this)
2. to make MagnetoDB scalable it is necessary to make MagnetoDB stateless. This allows us to run a lot of independent MagnetoDB processes and spread the request flow between them: 1) on a single node, to load all CPU cores; 2) on different nodes, for horizontal scalability
3. use Cassandra as the most reliable and mature distributed key-value storage
4. use the DataStax python-driver as the most modern Cassandra Python client, which supports the newest CQL3 and Cassandra native binary protocol feature set

Considering these points, the following technologies were chosen:
1. gevent as one of the fastest non-blocking single-thread WSGI servers. It is based on the greenlet library and supports monkey patching of the standard threading library. That is necessary because the DataStax python-driver uses the threading library, and its backlog has a task to add gevent support. (We patched python-driver ourselves to enable this feature as a temporary solution and are waiting for new python-driver releases.) This makes gevent more interesting to use than other analogues (like eventlet, for example)
2. gunicorn as the WSGI server, which is able to run a few worker processes plus a master process for managing the workers and routing requests between them. It also has gevent integration and can run gevent-based workers. We also analyzed analogues, such as uWSGI. It looks faster, but unfortunately we didn't manage to get uWSGI to work in multi-process mode with the MagnetoDB application.

I also want to add that currently the oslo WSGI framework is used for request routing. I know that the current OpenStack trend is to migrate WSGI services to the Pecan WSGI framework; maybe that is reasonable for MagnetoDB too. We would like to hear your opinions about the libraries and approaches we have chosen, and we would appreciate your help and support in finding the best balance between performance, developer friendliness and OpenStack standards. -- Best regards, Dmitriy Ukhlov Mirantis Inc. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
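A minimal sketch of the gunicorn/gevent wiring described above, assuming a standard gunicorn Python config file; the port, worker count and module path are illustrative, and the monkey-patching note reflects the temporary python-driver patching mentioned in the mail rather than a finished setup.

```python
# gunicorn_config.py -- sketch of the setup described above: one master process
# managing several single-threaded, gevent-based workers. Values are illustrative.
import multiprocessing

bind = "0.0.0.0:8480"
workers = multiprocessing.cpu_count()  # one worker per core to load all CPUs
worker_class = "gevent"                # greenlet-based non-blocking IO per worker

# The WSGI entry point itself would start with
#   from gevent import monkey; monkey.patch_all()
# so that python-driver's use of the standard threading library becomes
# cooperative inside the gevent event loop.
```

Such a config would be used as, for example, `gunicorn -c gunicorn_config.py <wsgi_module>:<app>`, where the module path depends on how MagnetoDB exposes its WSGI application.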
Re: [openstack-dev] [MagnetoDB] Weekly IRC meeting schedule
+ On Thu, Mar 6, 2014 at 5:25 PM, Maksym Iarmak miar...@mirantis.com wrote: + 13.00 UTC is OK. 2014-03-06 17:18 GMT+02:00 Ilya Sviridov isviri...@mirantis.com: Any other opinions? On Thu, Mar 6, 2014 at 4:31 PM, Illia Khudoshyn ikhudos...@mirantis.com wrote: 1300 UTC is fine for me On Thu, Mar 6, 2014 at 4:24 PM, Ilya Sviridov isviri...@mirantis.com wrote: Hello magnetodb contributors, I would like to suggest weekly IRC meetings on Thursdays, 1300 UTC. More technical details later. Let us vote by replying to this email. With best regards, Ilya Sviridov ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev -- Best regards, Illia Khudoshyn, Software Engineer, Mirantis, Inc. 38, Lenina ave. Kharkov, Ukraine www.mirantis.com www.mirantis.ru Skype: gluke_work ikhudos...@mirantis.com ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev