unsubscribe

2019-11-26 Thread @Nandan@
unsubscribe


Re: Improve data load performance

2018-08-14 Thread @Nandan@
Please explain your question in as much detail as possible.
This is not a single-line Q&A session where we can understand your
in-depth query from a single line.
For a better and more useful reply, please state your question and describe
what steps you took and what issue you are seeing.

I hope I am making it clear. Don't take it personally.

Thanks

On Wed, Aug 15, 2018 at 8:25 AM Abdul Patel  wrote:

> How can we improve data load performance?


Re: cqlsh COPY ... TO ... doesn't work if one node down

2018-07-01 Thread @Nandan@
The CQL COPY command will not work in this case because COPY checks that
all N nodes are in UP and RUNNING status.
If you want it to complete, you have 2 options:
1) Remove the DOWN node so that COPY no longer tries to reach it
2) Bring the node back to UP and NORMAL status
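As a side note on availability: whether reads of the data itself survive a down node depends on replication factor and consistency level, not on the driver's connection pool. A quick plain-Python sketch of the arithmetic (illustrative only, not driver code):

```python
def quorum(rf: int) -> int:
    """Number of replicas a QUORUM operation must reach for a given RF."""
    return rf // 2 + 1

def tolerable_down(rf: int) -> int:
    """How many replicas can be down while QUORUM reads still succeed."""
    return rf - quorum(rf)

print(tolerable_down(5))  # with RF=5, QUORUM tolerates 2 down replicas
```

So with RF=5, as in the cluster below, one down replica should not make the data unavailable; the failure here is about the client connection, not the data.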



On Mon, Jul 2, 2018 at 9:15 AM, Anup Shirolkar <
anup.shirol...@instaclustr.com> wrote:

> Hi,
>
> The error shows that the cqlsh connection to the down node failed.
> You should debug why that happened.
>
> Although you specified another node ('10.0.0.154') in the cqlsh command,
> my guess is that the down node was still in the connection pool, so a
> connection to it was attempted.
>
> Ideally the availability of the data itself should not be hampered by
> the unavailability of one replica out of 5.
> Also, the stack trace is about a 'cqlsh' connection error.
>
> I think once you get the connection sorted out, COPY should work as usual.
>
> Regards,
> Anup
>
>
> On 30 June 2018 at 15:05, Dmitry Simonov  wrote:
>
>> Hello!
>>
>> I have a Cassandra cluster with 5 nodes.
>> There is a (relatively small) keyspace X with RF=5.
>> One node went down.
>>
>> Status=Up/Down
>> |/ State=Normal/Leaving/Joining/Moving
>> --  Address     Load       Tokens  Owns (effective)  Host ID                               Rack
>> UN  10.0.0.82   253.64 MB  256     100.0%            839bef9d-79af-422c-a21f-33bdcf4493c1  rack1
>> UN  10.0.0.154  255.92 MB  256     100.0%            ce23f3a7-67d2-47c0-9ece-7a5dd67c4105  rack1
>> UN  10.0.0.76   461.26 MB  256     100.0%            c8e18603-0ede-43f0-b713-3ff47ad92323  rack1
>> UN  10.0.0.94   575.78 MB  256     100.0%            9a324dbc-5ae1-4788-80e4-d86dcaae5a4c  rack1
>> DN  10.0.0.47   ?          256     100.0%            7b628ca2-4e47-457a-ba42-5191f7e5374b  rack1
>>
>> I try to export some data using COPY TO, but it fails after long retries.
>> Why does it fail?
>> How can I make a copy?
>> There must be 4 copies of each row on other (alive) replicas.
>>
>> cqlsh 10.0.0.154 -e "COPY X.Y TO 'backup/X.Y' WITH NUMPROCESSES=1"
>>
>> Using 1 child processes
>>
>> Starting copy of X.Y with columns [key, column1, value].
>> 2018-06-29 19:12:23,661 Failed to create connection pool for new host
>> 10.0.0.47:
>> Traceback (most recent call last):
>>   File "/usr/lib/foobar/lib/python3.5/site-packages/cassandra/cluster.py",
>> line 2476, in run_add_or_renew_pool
>> new_pool = HostConnection(host, distance, self)
>>   File "/usr/lib/foobar/lib/python3.5/site-packages/cassandra/pool.py",
>> line 332, in __init__
>> self._connection = session.cluster.connection_factory(host.address)
>>   File "/usr/lib/foobar/lib/python3.5/site-packages/cassandra/cluster.py",
>> line 1205, in connection_factory
>> return self.connection_class.factory(address, self.connect_timeout,
>> *args, **kwargs)
>>   File "/usr/lib/foobar/lib/python3.5/site-packages/cassandra/connection.py",
>> line 332, in factory
>> conn = cls(host, *args, **kwargs)
>>   File 
>> "/usr/lib/foobar/lib/python3.5/site-packages/cassandra/io/asyncorereactor.py",
>> line 344, in __init__
>> self._connect_socket()
>>   File "/usr/lib/foobar/lib/python3.5/site-packages/cassandra/connection.py",
>> line 371, in _connect_socket
>> raise socket.error(sockerr.errno, "Tried connecting to %s. Last
>> error: %s" % ([a[4] for a in addresses], sockerr.strerror or sockerr))
>> OSError: [Errno None] Tried connecting to [('10.0.0.47', 9042)]. Last
>> error: timed out
>> 2018-06-29 19:12:23,665 Host 10.0.0.47 has been marked down
>> 2018-06-29 19:12:29,674 Error attempting to reconnect to 10.0.0.47,
>> scheduling retry in 2.0 seconds: [Errno None] Tried connecting to
>> [('10.0.0.47', 9042)]. Last error: timed out
>> 2018-06-29 19:12:36,684 Error attempting to reconnect to 10.0.0.47,
>> scheduling retry in 4.0 seconds: [Errno None] Tried connecting to
>> [('10.0.0.47', 9042)]. Last error: timed out
>> 2018-06-29 19:12:45,696 Error attempting to reconnect to 10.0.0.47,
>> scheduling retry in 8.0 seconds: [Errno None] Tried connecting to
>> [('10.0.0.47', 9042)]. Last error: timed out
>> 2018-06-29 19:12:58,716 Error attempting to reconnect to 10.0.0.47,
>> scheduling retry in 16.0 seconds: [Errno None] Tried connecting to
>> [('10.0.0.47', 9042)]. Last error: timed out
>> 2018-06-29 19:13:19,756 Error attempting to reconnect to 10.0.0.47,
>> scheduling retry in 32.0 seconds: [Errno None] Tried connecting to
>> [('10.0.0.47', 9042)]. Last error: timed out
>> 2018-06-29 19:13:56,834 Error attempting to reconnect to 10.0.0.47,
>> scheduling retry in 64.0 seconds: [Errno None] Tried connecting to
>> [('10.0.0.47', 9042)]. Last error: timed out
>> 2018-06-29 19:15:05,887 Error attempting to reconnect to 10.0.0.47,
>> scheduling retry in 128.0 seconds: [Errno None] Tried connecting to
>> [('10.0.0.47', 9042)]. Last error: timed out
>> 2018-06-29 19:17:18,982 Error attempting to reconnect to 10.0.0.47,
>> scheduling retry in 256.0 seconds: [Errno None] Tried connecting to
>> [('10.0.0.47', 
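The retry intervals in the log above (2.0, 4.0, 8.0 ... 256.0 seconds) are the driver's exponential reconnection backoff. A small plain-Python sketch of that doubling schedule (illustrative; the real behaviour lives in the python driver's reconnection policy):

```python
def reconnect_schedule(base_delay: float, max_delay: float, attempts: int):
    """Yield exponentially growing retry delays, capped at max_delay."""
    delay = base_delay
    for _ in range(attempts):
        yield delay
        delay = min(delay * 2, max_delay)

print(list(reconnect_schedule(2.0, 600.0, 8)))
# -> [2.0, 4.0, 8.0, 16.0, 32.0, 64.0, 128.0, 256.0]
```

The printed schedule matches the retry delays seen in the log, which is why the COPY appears to hang for minutes before giving up.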

Re:

2018-06-19 Thread @Nandan@
Check which Java version you are using.

On Tue, Jun 19, 2018 at 6:18 PM, Deniz Acay  wrote:

> Hello there,
>
> Let me get straight to the point. Yesterday our three node Cassandra
> production cluster had a problem and we could not find a solution yet.
> Before taking more radical actions, I would like to consult you about the
> issue.
>
> We are using Cassandra version 3.11.0. Cluster is living on AWS EC2 nodes
> of type m4.2xlarge with 32 GBs of RAM. Each node Dockerized using host
> networking mode. Two EBS SSD volumes are attached to each node, 32GB for
> commit logs (io1) and 4TB for data directory (gp2). We have been running
> smoothly for 7 months and filled %55 of data directory on each node.
> Now our C* nodes fail during bootstrapping phase. Let me paste the logs
> from system.log file from start to the time of error:
>
> INFO  [main] 2018-06-19 09:51:32,726 YamlConfigurationLoader.java:89 -
> Configuration location: file:/opt/apache-cassandra-3.
> 11.0/conf/cassandra.yaml
> INFO  [main] 2018-06-19 09:51:32,954 Config.java:481 - Node
> configuration:[allocate_tokens_for_keyspace=botanalytics; 
> authenticator=AllowAllAuthenticator;
> authorizer=AllowAllAuthorizer; auto_bootstrap=false; auto_snapshot=true;
> back_pressure_enabled=false; back_pressure_strategy=org.
> apache.cassandra.net.RateBasedBackPressure{high_ratio=0.9, factor=5,
> flow=FAST}; batch_size_fail_threshold_in_kb=50;
> batch_size_warn_threshold_in_kb=5; batchlog_replay_throttle_in_kb=1024;
> broadcast_address=null; broadcast_rpc_address=null; 
> buffer_pool_use_heap_if_exhausted=true;
> cas_contention_timeout_in_ms=1000; cdc_enabled=false;
> cdc_free_space_check_interval_ms=250; 
> cdc_raw_directory=/var/data/cassandra/cdc_raw;
> cdc_total_space_in_mb=0; client_encryption_options=;
> cluster_name=Botanalytics Production; column_index_cache_size_in_kb=2;
> column_index_size_in_kb=64; commit_failure_policy=stop_commit;
> commitlog_compression=null; commitlog_directory=/var/data/cassandra_commitlog;
> commitlog_max_compression_buffers_in_pool=3;
> commitlog_periodic_queue_size=-1; commitlog_segment_size_in_mb=32;
> commitlog_sync=periodic; commitlog_sync_batch_window_in_ms=NaN;
> commitlog_sync_period_in_ms=1; commitlog_total_space_in_mb=8192;
> compaction_large_partition_warning_threshold_mb=100;
> compaction_throughput_mb_per_sec=1600; concurrent_compactors=null;
> concurrent_counter_writes=32; concurrent_materialized_view_writes=32;
> concurrent_reads=32; concurrent_replicates=null; concurrent_writes=64;
> counter_cache_keys_to_save=2147483647; counter_cache_save_period=7200;
> counter_cache_size_in_mb=null; counter_write_request_timeout_in_ms=5000;
> credentials_cache_max_entries=1000; credentials_update_interval_in_ms=-1;
> credentials_validity_in_ms=2000; cross_node_timeout=false;
> data_file_directories=[Ljava.lang.String;@662b4c69;
> disk_access_mode=auto; disk_failure_policy=best_effort;
> disk_optimization_estimate_percentile=0.95; 
> disk_optimization_page_cross_chance=0.1;
> disk_optimization_strategy=ssd; dynamic_snitch=true;
> dynamic_snitch_badness_threshold=0.1; 
> dynamic_snitch_reset_interval_in_ms=60;
> dynamic_snitch_update_interval_in_ms=100; 
> enable_scripted_user_defined_functions=false;
> enable_user_defined_functions=false; 
> enable_user_defined_functions_threads=true;
> encryption_options=null; endpoint_snitch=Ec2Snitch;
> file_cache_size_in_mb=null; gc_log_threshold_in_ms=200;
> gc_warn_threshold_in_ms=1000; hinted_handoff_disabled_datacenters=[];
> hinted_handoff_enabled=true; hinted_handoff_throttle_in_kb=1024;
> hints_compression=null; hints_directory=null; hints_flush_period_in_ms=1;
> incremental_backups=false; index_interval=null;
> index_summary_capacity_in_mb=null; 
> index_summary_resize_interval_in_minutes=60;
> initial_token=null; inter_dc_stream_throughput_outbound_megabits_per_sec=200;
> inter_dc_tcp_nodelay=false; internode_authenticator=null;
> internode_compression=dc; internode_recv_buff_size_in_bytes=0;
> internode_send_buff_size_in_bytes=0; key_cache_keys_to_save=2147483647;
> key_cache_save_period=14400; key_cache_size_in_mb=null;
> listen_address=172.31.6.233; listen_interface=null;
> listen_interface_prefer_ipv6=false; listen_on_broadcast_address=false;
> max_hint_window_in_ms=1080; max_hints_delivery_threads=2;
> max_hints_file_size_in_mb=128; max_mutation_size_in_kb=null;
> max_streaming_retries=3; max_value_size_in_mb=256;
> memtable_allocation_type=heap_buffers; memtable_cleanup_threshold=null;
> memtable_flush_writers=0; memtable_heap_space_in_mb=null;
> memtable_offheap_space_in_mb=null; min_free_space_per_drive_in_mb=50;
> native_transport_max_concurrent_connections=-1; native_transport_max_
> concurrent_connections_per_ip=-1; native_transport_max_frame_size_in_mb=256;
> native_transport_max_threads=128; native_transport_port=9042;
> native_transport_port_ssl=null; num_tokens=8; 
> otc_backlog_expiration_interval_ms=200;
> otc_coalescing_enough_coalesced_messages=8; 

Re: Restoring snapshot

2018-06-11 Thread @Nandan@
Before restoring, check the version of the sstables you are using to import
data into your schema. Also, since you removed and re-added the age column,
there is currently no data for that column; if you want your old data to
show up again, you have to restore from the matching sstables.

On Tue, Jun 12, 2018 at 12:58 PM, Joseph Arriola 
wrote:

>
> Hi Vishal!
>
> Did you copy the sstables into data directory?
>
> Another thing: check that the table ID in the data directory matches the
> one Cassandra has in its metadata.
>
> https://docs.datastax.com/en/dse/5.1/cql/cql/cql_using/
> useCreateTableCollisionFix.html
>
>
>
>
>
> El El lun, 11 de jun. de 2018 a las 12:24 p. m., Elliott Sims <
> elli...@backblaze.com> escribió:
>
>> It's possible that it's something more subtle, but keep in mind that
>> sstables don't include the schema.  If you've made schema changes, you need
>> to apply/revert those first or C* probably doesn't know what to do with
>> those columns in the sstable.
>>
>> On Sun, Jun 10, 2018 at 11:38 PM,  wrote:
>>
>>> Dear Community,
>>>
>>>
>>>
>>> I’ll appreciate if I can get some responses to the observation below:
>>>
>>>
>>>
>>> https://stackoverflow.com/q/50763067/5701173
>>>
>>>
>>>
>>> Thanks and regards,
>>>
>>> Vishal Sharma
>>>
>>>
>>
>>
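The table-ID check mentioned in this thread can be scripted: on disk, each table lives in a directory named `<table>-<32-hex-id>`, and that ID must match the one in Cassandra's schema metadata. A small sketch extracting the ID (the directory name is hypothetical):

```python
import uuid

def table_id_from_dir(dirname: str) -> uuid.UUID:
    """Parse the table ID out of a data-directory name like 'users-5a1c395e...'."""
    name, _, hexid = dirname.rpartition("-")
    if len(hexid) != 32:
        raise ValueError(f"not a table data directory: {dirname}")
    return uuid.UUID(hexid)

# hypothetical directory under the keyspace's data directory
print(table_id_from_dir("users-5a1c395ee9d511e7a4f2936e9b951f93"))
# -> 5a1c395e-e9d5-11e7-a4f2-936e9b951f93
```

If the parsed ID differs from the table's `id` in the schema, the sstables were copied into a stale directory and Cassandra will not see them.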


Re: Cassandra Node free memory very low even with 32GB RAM

2018-06-04 Thread @Nandan@
Hi,
As I understand it, you increased your RAM and CPU but are still seeing
very little free memory.
Did you set the Java heap size manually? This can happen because the JVM
allocates memory automatically. For memory visualization, please use
jconsole and check what it shows.
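One hedged note: on a long-running Linux box, low "free" often just reflects the page cache, which Cassandra leans on heavily; `MemAvailable` in /proc/meminfo is usually the more meaningful number. A sketch parsing meminfo-style output (sample values inlined and hypothetical):

```python
SAMPLE = """\
MemTotal:       32778940 kB
MemFree:          204800 kB
MemAvailable:   18321400 kB
Cached:         17965056 kB
"""

def parse_meminfo(text: str) -> dict:
    """Return /proc/meminfo-style fields as {name: kibibytes}."""
    out = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        out[name] = int(rest.split()[0])
    return out

info = parse_meminfo(SAMPLE)
# 'MemFree' looks alarming, but most memory is reclaimable page cache
print(info["MemFree"] // 1024, "MB free,", info["MemAvailable"] // 1024, "MB available")
```

Reading the real `/proc/meminfo` on the affected nodes would show whether the "missing" memory is actually available cache.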


On Tue, Jun 5, 2018 at 4:22 AM, Leena Ghatpande 
wrote:

> We are on cassandra 3.7 version
> We have a 8 node production cluster, with 4 nodes  each across 2 DC
> The RF is set to 3 currently, and we have 2 large tables with upto
> 70Million rows
>
> We just upgraded our Production cluster from 4CPU , 12 GB RAM to 8 CPU 32
> GB Memory. Accordingly we increased our Heap Size to 8GB per the
> recommended guidelines.
>
> 4 boxes in one Datacenter show 2 to 3gb of free memory, but we have 3
> nodes in 1 datacenter that show free memory between 600MB and 200 MB only.
>
> The heap on the boxes is being reclaimed and we have no special off heap
> setting, we use the defaults.
>
> Is this something we need to be concerned with where 1 node has around
> only 200MB free? Is there any settings that we might have to look for to
> increase the free memory?
>
>
>
>


Re: performance on reading only the specific nonPk column

2018-05-21 Thread @Nandan@
First, a question [just as a concern]:
How are you making sure that your PK guarantees uniqueness?
For example, if 10 users write data at the same time, how is your schema
going to handle that?

Now, on your question:
"does the read on the specific node happen first bringing all the metrics
m1 - m100 and then the metric is sliced in memory and retrieved, or does
the disk read happen only on the sliced data m1 without bringing m1 - m100?"
For a selection like this, the READ proceeds as follows:
Cassandra first locates the partition for ID = 10, then it scans the
clustering range defined by the timestamp bounds you gave.


On Mon, May 21, 2018 at 4:34 PM, sujeet jog  wrote:

> Folks,
>
> consider a table with 100 metrics with (id , timestamp ) as key,
> if one wants to do a selective metric read
>
> select m1 from table where id = 10 and timestamp >= '2017-01-02
> :00:00:00'
> and timestamp <= '2017-01-02 04:00:00'
>
> does the read on the specific node happen first bringing all the metrics
> m1 - m100 and then the metric is  sliced in memory and retrieve ,  or the
> disk read happens only on the sliced data m1 without bringing m1- m100  ?
>
> here partition & clustering key is provided in the query, the question is
> more towards efficiency operation on this schema for read.
>
> create table metrics (
>     id int,
>     timestamp timestamp,
>     m1 int,
>     m2 int,
>     m3 int,
>     m4 int,
>     ..
>     m100 int,
>     primary key (id, timestamp)
> )
>
> Thanks
>


Re: Data Deleted After a few days of being off

2018-02-26 Thread @Nandan@
Please check your error.log file. From what you describe, your data is not
making it into the data files, so the error.log may show why.

On Tue, Feb 27, 2018 at 2:37 PM, A <htt...@yahoo.com.invalid> wrote:

> Hey there. Thanks for your response.
>
> Yes.  I absolutely inserted the data correctly and queried it many times
> through cqlsh> as well as through node.js.
>
> I started going through the logs and haven't noticed anything yet... Very
> unexpected behavior.
>
> Thanks.
> A
>
>
> Sent from Yahoo Mail for iPad <https://yho.com/footer0>
>
>
> On Monday, February 26, 2018, 9:23 PM, @Nandan@ <
> nandanpriyadarshi...@gmail.com> wrote:
>
> Hi A,
> As I understand your question:
> 1) You inserted some data into your table and it was inserted successfully.
> 2) Then you stopped the Cassandra service.
> 3) After a few days you started the service again, checked your table, and
> there was no data.
> Did you verify, right after inserting, that the data was actually written?
> And did you check the system.log file, in case there is a relevant message
> there?
> Please cross-check once.
>
> On Tue, Feb 27, 2018 at 1:16 PM, A <htt...@yahoo.com.invalid> wrote:
>
> I'm new to Cassandra.  Trying it out to see if it will work for my
> upcoming project.  I created a test keyspace and table on my dev laptop.
> Loaded it with some data on a Friday and closed her down.  Returned on
> Monday and looked up the data and it was gone.  The keyspace and table was
> there, but table was empty.  This has happened twice so far.
>
> Help...
>
> Thanks,
> Angel
>
>
>


Re: Data Deleted After a few days of being off

2018-02-26 Thread @Nandan@
Hi A,
As I understand your question:
1) You inserted some data into your table and it was inserted successfully.
2) Then you stopped the Cassandra service.
3) After a few days you started the service again, checked your table, and
there was no data.
Did you verify, right after inserting, that the data was actually written?
And did you check the system.log file, in case there is a relevant message
there?
Please cross-check once.

On Tue, Feb 27, 2018 at 1:16 PM, A  wrote:

> I'm new to Cassandra.  Trying it out to see if it will work for my
> upcoming project.  I created a test keyspace and table on my dev laptop.
> Loaded it with some data on a Friday and closed her down.  Returned on
> Monday and looked up the data and it was gone.  The keyspace and table was
> there, but table was empty.  This has happened twice so far.
>
> Help...
>
> Thanks,
> Angel
>


Re: Cassandra is not running

2018-02-13 Thread @Nandan@
Please downgrade your Java version.

On Feb 13, 2018 11:46 PM, "Irtiza Ali"  wrote:

> Thank you
>
> On 13 Feb 2018 20:45, "Jürgen Albersdorfer" 
> wrote:
>
>> 1.8.0_161 is not yet supported - try 1.8.0_151
>>
>> Am 13.02.2018 um 16:44 schrieb Irtiza Ali :
>>
>> 1.8.0_161
>>
>>
>>


Re: Cassandra is not running

2018-02-13 Thread @Nandan@
What error or WARN message did you get in the system.log file? Also check
the output.log file.


On Feb 13, 2018 11:34 PM, "Irtiza Ali"  wrote:

> What should I do now?
>
> On 13 Feb 2018 20:21, "Irtiza Ali"  wrote:
>
>> Thank you i will check it
>>
>>
>> On 13 Feb 2018 20:16, "Nicolas Guyomar" 
>> wrote:
>>
>>> Hi,
>>>
>>> Such a quick failure might indicate that you are trying to start
>>> Cassandra with the latest JDK which is not yet supported.
>>>
>>> You should have a look at the /var/log/system.log
>>>
>>> On 13 February 2018 at 16:03, Irtiza Ali  wrote:
>>>
 Hello everyone:

 Whenever I try to run Cassandra, it stops with this status:

 result of [sudo service cassandra status] command:

 ● cassandra.service - LSB: distributed storage system for structured
 data
Loaded: loaded (/etc/init.d/cassandra; bad; vendor preset: enabled)
Active: active (exited) since 2018-02-13 19:57:51 PKT; 31s ago
  Docs: man:systemd-sysv-generator(8)
   Process: 28844 ExecStop=/etc/init.d/cassandra stop (code=exited,
 status=0/SUCCESS)
   Process: 28929 ExecStart=/etc/init.d/cassandra start (code=exited,
 status=0/SUCCESS)

 Anyone knows why cassandra is not running properly?

 With Regards
 Irtiza Ali

>>>
>>>


Re: does copy command will clear all the old data?

2018-02-12 Thread @Nandan@
Hi Peng,
The COPY command appends (upserts) data into your existing Cassandra table;
it does not clear the old data. Just for testing, I previously loaded 50
rows with the COPY command into a small CQL table and it worked fine.

One point to make sure of: check your primary key before playing with the
COPY command, since rows with matching keys will be overwritten.

Thanks
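A minimal sketch of that append behaviour (table and file names are hypothetical): rows in the CSV whose primary key matches an existing row overwrite it, and all others are appended.

```cql
-- hypothetical table
CREATE TABLE shop.users (id int PRIMARY KEY, name text);

-- appends/upserts rows from the CSV; existing rows with matching ids are
-- overwritten, nothing is cleared
COPY shop.users (id, name) FROM 'users.csv' WITH HEADER = true;
```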


On Tue, Feb 13, 2018 at 12:49 PM, Peng Xiao <2535...@qq.com> wrote:

> Dear All,
>
> I'm trying to import a csv file into a table with the COPY command. The
> question is: will the COPY command clear all the old data in this table?
> We only want to append the csv file to this table.
>
> Thanks
>


Use Oracle Virtual Machine to Set Up Multiple Node 2 DC cluster

2018-01-11 Thread @Nandan@
Hi,
Can you please help me understand how to set up networking in an Oracle
VirtualBox machine so that I can achieve a multi-DC, multi-node
configuration?

I am using Windows 10 as the host machine. I tried to set it up using NAT
but it is not working.

Please provide some ideas. Thanks.


Nandan


Reg:- limitation as PROS and CONS of Using Collections in Data modeling

2018-01-01 Thread @Nandan@
Hi All,
I want to know what the limitations are of using collections such as SET,
LIST, and MAP.
In my case, when inserting video details, I have to store language-specific
titles, such as:
Language: English, Title: Video name
Language: Hindi, Title: Video name in Hindi
Language: Chinese, Title: Video name in Chinese
The reason for this is that my website has multi-language support: if a
user opens the English page the video title is shown in English, and when a
user opens the Chinese page the Chinese title is displayed.
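One hedged sketch of how this requirement could map onto a MAP keyed by language code (table and column names are hypothetical). The main limits to keep in mind: a collection is always read as a whole, large collections hurt read performance, and a collection cannot hold more than 64K elements.

```cql
-- hypothetical: one row per video, titles keyed by language code
CREATE TABLE videos_by_id (
    video_id uuid PRIMARY KEY,
    titles   map<text, text>   -- e.g. {'en': 'Video name', 'zh': '...'}
);

-- add or replace one language's title without rewriting the whole map
UPDATE videos_by_id
   SET titles['en'] = 'Video name'
 WHERE video_id = 550e8400-e29b-41d4-a716-446655440000;
```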

Please suggest me.

Thanks and Best Regards,


Reg:- Data Modelling Concept ** Amazon Video **

2017-12-21 Thread @Nandan@
Hi All,

For self-learning, I am trying to do data modeling for Amazon Video and to
check whether such a model is feasible.

Below are the details:-
Amazon Video Contains different columns such as:-
1) Video_title -> One Video, One Title
2) Release_year -> One Video, One Year
3) Series -> One Video, Multiple Series
4) Rating -> Total no. of Ratings
5) Description -> One Video, One Description
6) Actors / Supporting actors -> Name contains First_name and Last_name, as
well as multiple Characters with Role details
7) Language -> One Video, Many Languages
8) Runtime -> One Video, One Runtime
9) Genres -> One Video, Many Genres
10) Director -> One Video, Many Directors
11) Producer -> One Video, Many Producers

My query plans are:
1) Select a video by title
2) Select all videos released in the same year
3) Select videos released in a particular year range
4) All video titles in a particular series
5) Select videos by language
6) Select videos by actor
7) Select videos by director
8) Select videos by producer
9) Select videos by producer in a specific year range
etc.

Please provide me some details so that I can work this out and solve all of
these queries over time.
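As a hedged starting point for queries 2 and 3 (videos by release year), Cassandra modeling is query-first: one denormalized table per query, with the query's filter as the partition key. All names below are hypothetical:

```cql
-- serves: all videos released in a given year
CREATE TABLE videos_by_year (
    release_year int,
    video_title  text,
    video_id     uuid,
    PRIMARY KEY (release_year, video_title)
);

SELECT video_title FROM videos_by_year WHERE release_year = 2017;
```

The other queries (by actor, director, producer, series) would each get an analogous table keyed on that attribute.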

Thanks.
Nandan Priyadarshi


Re: Reg:- Data modelling For E-Commerce Pattern data modelling for Search

2017-12-07 Thread @Nandan@
Thanks. But my question comes back to the same place: how do we do the data
modeling? If we denormalize, we have to accept a lot of data duplication,
and inserts and updates also need thought, because we have to write the
same data into multiple tables at the same time.
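The multi-table write concern above is commonly handled with a logged batch, which guarantees that all mutations in the batch are eventually applied, keeping the denormalized copies consistent with each other (table names are hypothetical):

```cql
-- same product written to two query tables atomically (logged batch)
BEGIN BATCH
  INSERT INTO shop.products_by_id    (product_id, name, brand)
         VALUES (1001, 'Phone X', 'Acme');
  INSERT INTO shop.products_by_brand (brand, product_id, name)
         VALUES ('Acme', 1001, 'Phone X');
APPLY BATCH;
```

Logged batches add coordinator overhead, so they are best reserved for exactly this keep-tables-in-sync case rather than for bulk loading.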


On Fri, Dec 8, 2017 at 10:54 AM, Jon Haddad <j...@jonhaddad.com> wrote:

> I mean ES is great as a search engine.  I would use Cassandra as my source
> of truth, and also index my data in ES.
>
> I typed my original message before I walked my dog, I should have also
> pointed out https://github.com/strapdata/elassandra and https
> ://github.com/Stratio/cassandra-lucene-index, but I haven’t used either
> one.
>
> Jon
>
>
> On Dec 7, 2017, at 5:59 PM, @Nandan@ <nandanpriyadarshi...@gmail.com>
> wrote:
>
> Hi Jon,
> Do you mean Elastic search for storing data or Data should be store into
> Cassandra and use Elastic Search for Select records from tables. ?
>
>
> On Fri, Dec 8, 2017 at 9:50 AM, Jon Haddad <j...@jonhaddad.com> wrote:
>
>> 1. No, Apache Cassandra is pretty terrible for search on its own.  Even
>> with SASI.
>> 2. Maybe, but it’s complicated, and doing it right takes a lot of
>> experience.  I’d use Elastic Search instead.
>>
>>
>>
>> > On Dec 7, 2017, at 5:39 PM, @Nandan@ <nandanpriyadarshi...@gmail.com>
>> wrote:
>> >
>> > Hi Peoples,
>> >
>> > As currently around the world 60-70% websites are excelling with
>> E-commerce in which we have to store huge amount of data and select pattern
>> based on Partial Search, Text match, Full-Text Search and all.
>> >
>> > So below questions comes to mind :
>> > 1) Is Cassandra a correct choice for data modeling which gives complex
>> Search patterned as  Amazon or eBay is using?
>> > 2) If we will use denormalized data modeling then is it will be
>> effective?
>> >
>> > Please clarify this.
>> >
>> > Thanks and Best regards,
>> > Nandan Priyadarshi
>>
>>
>> -
>> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
>> For additional commands, e-mail: user-h...@cassandra.apache.org
>>
>>
>
>


Re: Reg:- Data modelling For E-Commerce Pattern data modelling for Search

2017-12-07 Thread @Nandan@
Hi Jon,
Do you mean Elastic search for storing data or Data should be store into
Cassandra and use Elastic Search for Select records from tables. ?


On Fri, Dec 8, 2017 at 9:50 AM, Jon Haddad <j...@jonhaddad.com> wrote:

> 1. No, Apache Cassandra is pretty terrible for search on its own.  Even
> with SASI.
> 2. Maybe, but it’s complicated, and doing it right takes a lot of
> experience.  I’d use Elastic Search instead.
>
>
>
> > On Dec 7, 2017, at 5:39 PM, @Nandan@ <nandanpriyadarshi...@gmail.com>
> wrote:
> >
> > Hi Peoples,
> >
> > As currently around the world 60-70% websites are excelling with
> E-commerce in which we have to store huge amount of data and select pattern
> based on Partial Search, Text match, Full-Text Search and all.
> >
> > So below questions comes to mind :
> > 1) Is Cassandra a correct choice for data modeling which gives complex
> Search patterned as  Amazon or eBay is using?
> > 2) If we will use denormalized data modeling then is it will be
> effective?
> >
> > Please clarify this.
> >
> > Thanks and Best regards,
> > Nandan Priyadarshi
>
>
> -
> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: user-h...@cassandra.apache.org
>
>


Reg:- Data modelling For E-Commerce Pattern data modelling for Search

2017-12-07 Thread @Nandan@
Hi Peoples,

Currently, 60-70% of websites around the world run e-commerce, where we
have to store huge amounts of data and support select patterns based on
partial search, text match, full-text search, and so on.

So the questions below come to mind:
1) Is Cassandra a correct choice for data modeling that supports complex
search patterns like the ones Amazon or eBay use?
2) If we use denormalized data modeling, will it be effective?

Please clarify this.
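For context on why a search engine is usually recommended alongside Cassandra here: full-text search is built on inverted indexes (term -> documents), which Cassandra's primary-key storage model does not provide. A toy sketch of the idea in plain Python:

```python
from collections import defaultdict

docs = {1: "acme phone with great camera", 2: "budget phone", 3: "acme laptop"}

# build a tiny inverted index: term -> set of doc ids
index = defaultdict(set)
for doc_id, text in docs.items():
    for term in text.split():
        index[term].add(doc_id)

def search(*terms):
    """Return ids of docs containing all the given terms (AND semantics)."""
    result = set(docs)
    for t in terms:
        result &= index.get(t, set())
    return sorted(result)

print(search("acme", "phone"))  # -> [1]
```

Engines like Elasticsearch or the Lucene-based Cassandra plugins maintain exactly this kind of structure at scale, which is why they handle partial and full-text matching that plain CQL cannot.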

Thanks and Best regards,
Nandan Priyadarshi


Re: update a record which does not exists

2017-12-03 Thread @Nandan@
In Cassandra we have the upsert concept. Suppose you insert data: it is
stored based on your PK. When you later run an UPDATE keyed on that PK,
there are two possibilities:
1) It updates the existing data with a new timestamp [recall that each cell
has a VALUE as well as a TIMESTAMP].
2) It inserts a new record if no matching row exists.

Thanks.
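A minimal sketch of the two cases (the keyspace and table are hypothetical, matching the statement quoted below): both statements produce a row for key 'row1' regardless of which runs first, because UPDATE and INSERT are both just writes of timestamped cells.

```cql
CREATE TABLE ks.columnfamily (key text PRIMARY KEY, data text);

-- even with no existing 'row1', this creates the row (upsert)
UPDATE ks.columnfamily SET data = 'test data' WHERE key = 'row1';

-- semantically the same write
INSERT INTO ks.columnfamily (key, data) VALUES ('row1', 'test data');
```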

On Mon, Dec 4, 2017 at 11:47 AM, Peng Xiao <2535...@qq.com> wrote:

> After testing, it does indeed insert a new record.
>
>
> -- Original --
> *From: * "My own mailbox" <2535...@qq.com>;
> *Date: * Mon, Dec 4, 2017 11:13 AM
> *To: * "user";
> *Subject: * update a record which does not exists
>
> Dear All,
> If we update a record which actually does not exist in Cassandra,will it
> generate a new record or exit?
>
> UPDATE columnfamily SET data = 'test data' WHERE key = 'row1';
> as in CQL Update and insert are semantically the same.Could anyone please
> advise?
>
> Thanks,
> Peng Xiao
>
>


Re: Nodetool repair on read only cluster

2017-11-29 Thread @Nandan@
Hi Roger,
You have provided rather little information, which makes this tough to
analyse. But please check the JIRA link below and see whether it is useful:
https://issues.apache.org/jira/browse/CASSANDRA-6616

Thanks.

On Thu, Nov 30, 2017 at 9:42 AM, Roger Warner  wrote:

>
>
> What would running a repair do on a cluster where there are no deletes,
> nor have there ever been? I have no deletes yet on my data. Yet running a
> repair took over 9 hours on a 5-node cluster.
>
>
>
> Roger?
>


Re: Reg:- CassandraRoleManager skipped default role setup Issue

2017-11-23 Thread @Nandan@
Hi Jai,
As you suggested, I stopped my first node and restarted it again. The
status is now as follows:

> On Node1:-
> WARN message is not coming.
> On Node 2:-
> WARN: CassandraRoleManager Skipped
> WARN: 10.0.0.3 node seems to be down
> On Node 3:-
> No Such Warning,




On Fri, Nov 24, 2017 at 9:08 AM, Jai Bheemsen Rao Dhanwada <
jaibheem...@gmail.com> wrote:

> yes,
>
> I had it in one of my cluster. try restarting the node once again and see
> if it creates the default role.
>
> On Thu, Nov 23, 2017 at 5:05 PM, @Nandan@ <nandanpriyadarshi...@gmail.com>
> wrote:
>
>> So It means you are also getting same WARN into your output.log file.
>> I am also getting this WARN into my NODE1.
>>
>>
>> On Fri, Nov 24, 2017 at 7:03 AM, Jai Bheemsen Rao Dhanwada <
>> jaibheem...@gmail.com> wrote:
>>
>>> I ran into similar issue before with 2.1.13 version of C*. and when I
>>> restart the node second time it actually created the default roles. I
>>> haven't dig deeper on the root cause. it happened to me only on one cluster
>>> out of 10+ clusters.
>>>
>>> On Wed, Nov 22, 2017 at 5:13 PM, @Nandan@ <nandanpriyadarshi...@gmail.co
>>> m> wrote:
>>>
>>>> Hi Jai,
>>>> I checked nodetool describecluster and got same schema version on all 4
>>>> nodes.
>>>>
>>>>> [nandan@node-1 ~]$ nodetool describecluster
>>>>
>>>> Cluster Information:
>>>> Name: Nandan
>>>> Snitch: org.apache.cassandra.locator.DynamicEndpointSnitch
>>>> Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
>>>> Schema versions:
>>>> 2e2ab56b-6639-394e-a1fe-4b35ba87473b: [10.0.0.2, 10.0.0.3, 10.0.0.4,
>>>> 10.0.0.1]
>>>>
>>>>
>>>> Thanks and best regards,
>>>> Nandan
>>>>
>>>> On Thu, Nov 23, 2017 at 5:37 AM, Jai Bheemsen Rao Dhanwada <
>>>> jaibheem...@gmail.com> wrote:
>>>>
>>>>> Can you do a nodetool describecluster and check if the schema version
>>>>> is matching on all the nodes?
>>>>>
>>>>>
>>>>> On Tue, Nov 21, 2017 at 11:52 PM, @Nandan@ <
>>>>> nandanpriyadarshi...@gmail.com> wrote:
>>>>>
>>>>>> Hi Team,
>>>>>>
>>>>>> Today I set up a test cluster with 4 nodes and using Apache Cassandra
>>>>>> 3.1.1 version.
>>>>>> After setup when I checked output.log file then I got WARN message as
>>>>>> below :-
>>>>>> WARN  08:51:38,122  CassandraRoleManager.java:355 -
>>>>>> CassandraRoleManager skipped default role setup: some nodes were not 
>>>>>> ready
>>>>>> WARN  08:51:46,269  DseDaemon.java:733 - The following nodes seems to
>>>>>> be down: [/10.0.0.2, /10.0.0.3, /10.0.0.4]. Some Cassandra
>>>>>> operations may fail with UnavailableException.
>>>>>>
>>>>>> But I checked Nodetool status and that is totally working fine and
>>>>>> all nodes are in UN status.
>>>>>>
>>>>>> Please tell me what have I need to check for this. ?
>>>>>> Thanks in Advance,
>>>>>> Nandan Priyadarshi
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>


Re: Reg:- CassandraRoleManager skipped default role setup Issue

2017-11-23 Thread @Nandan@
So that means you were also getting the same WARN in your output.log file.
I am also getting this WARN on my node 1.


On Fri, Nov 24, 2017 at 7:03 AM, Jai Bheemsen Rao Dhanwada <
jaibheem...@gmail.com> wrote:

> I ran into similar issue before with 2.1.13 version of C*. and when I
> restart the node second time it actually created the default roles. I
> haven't dig deeper on the root cause. it happened to me only on one cluster
> out of 10+ clusters.
>
> On Wed, Nov 22, 2017 at 5:13 PM, @Nandan@ <nandanpriyadarshi...@gmail.com>
> wrote:
>
>> Hi Jai,
>> I checked nodetool describecluster and got same schema version on all 4
>> nodes.
>>
>>> [nandan@node-1 ~]$ nodetool describecluster
>>
>> Cluster Information:
>> Name: Nandan
>> Snitch: org.apache.cassandra.locator.DynamicEndpointSnitch
>> Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
>> Schema versions:
>> 2e2ab56b-6639-394e-a1fe-4b35ba87473b: [10.0.0.2, 10.0.0.3, 10.0.0.4,
>> 10.0.0.1]
>>
>>
>> Thanks and best regards,
>> Nandan
>>
>> On Thu, Nov 23, 2017 at 5:37 AM, Jai Bheemsen Rao Dhanwada <
>> jaibheem...@gmail.com> wrote:
>>
>>> Can you do a nodetool describecluster and check if the schema version is
>>> matching on all the nodes?
>>>
>>>
>>> On Tue, Nov 21, 2017 at 11:52 PM, @Nandan@ <
>>> nandanpriyadarshi...@gmail.com> wrote:
>>>
>>>> Hi Team,
>>>>
>>>> Today I set up a test cluster with 4 nodes and using Apache Cassandra
>>>> 3.1.1 version.
>>>> After setup when I checked output.log file then I got WARN message as
>>>> below :-
>>>> WARN  08:51:38,122  CassandraRoleManager.java:355 -
>>>> CassandraRoleManager skipped default role setup: some nodes were not ready
>>>> WARN  08:51:46,269  DseDaemon.java:733 - The following nodes seems to
>>>> be down: [/10.0.0.2, /10.0.0.3, /10.0.0.4]. Some Cassandra operations
>>>> may fail with UnavailableException.
>>>>
>>>> But I checked Nodetool status and that is totally working fine and all
>>>> nodes are in UN status.
>>>>
>>>> Please tell me what have I need to check for this. ?
>>>> Thanks in Advance,
>>>> Nandan Priyadarshi
>>>>
>>>
>>>
>>
>


Re: Reg:- CassandraRoleManager skipped default role setup Issue

2017-11-22 Thread @Nandan@
Hi Jai,
I checked nodetool describecluster and got same schema version on all 4
nodes.

> [nandan@node-1 ~]$ nodetool describecluster

Cluster Information:
Name: Nandan
Snitch: org.apache.cassandra.locator.DynamicEndpointSnitch
Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
Schema versions:
2e2ab56b-6639-394e-a1fe-4b35ba87473b: [10.0.0.2, 10.0.0.3, 10.0.0.4,
10.0.0.1]


Thanks and best regards,
Nandan

On Thu, Nov 23, 2017 at 5:37 AM, Jai Bheemsen Rao Dhanwada <
jaibheem...@gmail.com> wrote:

> Can you do a nodetool describecluster and check if the schema version is
> matching on all the nodes?
>
>
> On Tue, Nov 21, 2017 at 11:52 PM, @Nandan@ <nandanpriyadarshi...@gmail.com
> > wrote:
>
>> Hi Team,
>>
>> Today I set up a test cluster with 4 nodes and using Apache Cassandra
>> 3.1.1 version.
>> After setup when I checked output.log file then I got WARN message as
>> below :-
>> WARN  08:51:38,122  CassandraRoleManager.java:355 - CassandraRoleManager
>> skipped default role setup: some nodes were not ready
>> WARN  08:51:46,269  DseDaemon.java:733 - The following nodes seems to be
>> down: [/10.0.0.2, /10.0.0.3, /10.0.0.4]. Some Cassandra operations may
>> fail with UnavailableException.
>>
>> But I checked Nodetool status and that is totally working fine and all
>> nodes are in UN status.
>>
>> Please tell me what have I need to check for this. ?
>> Thanks in Advance,
>> Nandan Priyadarshi
>>
>
>


Reg:- CassandraRoleManager skipped default role setup Issue

2017-11-21 Thread @Nandan@
Hi Team,

Today I set up a test cluster with 4 nodes and using Apache Cassandra 3.1.1
version.
After setup when I checked output.log file then I got WARN message as below
:-
WARN  08:51:38,122  CassandraRoleManager.java:355 - CassandraRoleManager
skipped default role setup: some nodes were not ready
WARN  08:51:46,269  DseDaemon.java:733 - The following nodes seems to be
down: [/10.0.0.2, /10.0.0.3, /10.0.0.4]. Some Cassandra operations may fail
with UnavailableException.

But I checked nodetool status, which works fine, and all
nodes are in UN status.

Please tell me what I need to check for this?
Thanks in Advance,
Nandan Priyadarshi


Re: Solr Search With Apache Cassandra

2017-11-20 Thread @Nandan@
Sorry for my mistake. I meant using Apache Cassandra for storage
and Solr from DSE for the search facility.
Thanks for your suggestion; I will check with the DSE folks.

Thanks
Nandan

On Mon, Nov 20, 2017 at 5:36 PM, Jacques-Henri Berthemet <
jacques-henri.berthe...@genesys.com> wrote:

> How are Cassandra and Solr related? they are two separate products.
>
>
>
> *--*
>
> *Jacques-Henri Berthemet*
>
>
>
> *From:* @Nandan@ [mailto:nandanpriyadarshi...@gmail.com]
> *Sent:* lundi 20 novembre 2017 10:04
> *To:* user <user@cassandra.apache.org>
> *Subject:* Re: Solr Search With Apache Cassandra
>
>
>
> Hi Jacques,
>
>
>
> For testing, I configure Apache Cassandra and Solr.
>
> and Using Solr Admin for testing the query.
>
>
>
> Thanks
>
>
>
> On Mon, Nov 20, 2017 at 4:37 PM, Jacques-Henri Berthemet <
> jacques-henri.berthe...@genesys.com> wrote:
>
> Hi,
>
>
>
> Apache Cassandra does not have Solr search, it’s Datastax Entreprise that
> supports such feature, you should contact Datastax support for such
> questions.
>
>
>
> Regards,
>
> *--*
>
> *Jacques-Henri Berthemet*
>
>
>
> *From:* @Nandan@ [mailto:nandanpriyadarshi...@gmail.com]
> *Sent:* lundi 20 novembre 2017 06:44
> *To:* user <user@cassandra.apache.org>
> *Subject:* Solr Search With Apache Cassandra
>
>
>
> Hi All,
>
> How Solr Search affect the READ operation from Cassandra?
>
> I am having a table with 100 columns with Primary Key as UUID.
>
> Note:- I am having 100 columns in a single table because of implemented
> Advance search on multiple columns like E-commerce.
>
>
>
> Now my concerns are:-
>
> 1) whenever do READ from a table based on a Primary key such as
>
> select * from table1 where col1 = UUID;
>
> It's working perfectly.
>
> 2) When even do READ from table with using solr on 1,2 columns such as
>
> using col1:val1 and col2:val2
>
> it is working also perfect.
>
> 3) But when I am performing a complex search, it is taking time such as
> 4-5 seconds.
>
> even currently READ and WRITE operations are not on the massive scale.
>
>
>
> So please tell me what's the cause and how to resolve this.
>
> Thanks
>
> Nandan Priyadarshi
>
>
>


Re: Solr Search With Apache Cassandra

2017-11-20 Thread @Nandan@
Hi Jacques,

For testing, I configured Apache Cassandra and Solr,
and I am using the Solr Admin UI to test the query.

Thanks

On Mon, Nov 20, 2017 at 4:37 PM, Jacques-Henri Berthemet <
jacques-henri.berthe...@genesys.com> wrote:

> Hi,
>
>
>
> Apache Cassandra does not have Solr search, it’s Datastax Entreprise that
> supports such feature, you should contact Datastax support for such
> questions.
>
>
>
> Regards,
>
> *--*
>
> *Jacques-Henri Berthemet*
>
>
>
> *From:* @Nandan@ [mailto:nandanpriyadarshi...@gmail.com]
> *Sent:* lundi 20 novembre 2017 06:44
> *To:* user <user@cassandra.apache.org>
> *Subject:* Solr Search With Apache Cassandra
>
>
>
> Hi All,
>
> How Solr Search affect the READ operation from Cassandra?
>
> I am having a table with 100 columns with Primary Key as UUID.
>
> Note:- I am having 100 columns in a single table because of implemented
> Advance search on multiple columns like E-commerce.
>
>
>
> Now my concerns are:-
>
> 1) whenever do READ from a table based on a Primary key such as
>
> select * from table1 where col1 = UUID;
>
> It's working perfectly.
>
> 2) When even do READ from table with using solr on 1,2 columns such as
>
> using col1:val1 and col2:val2
>
> it is working also perfect.
>
> 3) But when I am performing a complex search, it is taking time such as
> 4-5 seconds.
>
> even currently READ and WRITE operations are not on the massive scale.
>
>
>
> So please tell me what's the cause and how to resolve this.
>
> Thanks
>
> Nandan Priyadarshi
>


Solr Search With Apache Cassandra

2017-11-19 Thread @Nandan@
Hi All,
How does Solr search affect READ operations in Cassandra?
I have a table with 100 columns whose primary key is a UUID.
Note: I have 100 columns in a single table because I implemented advanced
search across multiple columns, as in e-commerce.

Now my concerns are:
1) Whenever I READ from the table by primary key, such as
select * from table1 where col1 = UUID;
it works perfectly.
2) When I READ from the table using Solr on one or two columns, such as
col1:val1 AND col2:val2,
it also works perfectly.
3) But when I perform a complex search, it takes around 4-5
seconds, even though READ and WRITE operations are currently not at
massive scale.

So please tell me what the cause is and how to resolve this.
Thanks
Nandan Priyadarshi


Re: Reg :- Multiple Node Cluster set up in Virtual Box

2017-11-07 Thread @Nandan@
Hi Users,
I successfully configured a 2-node cluster, but when I configured a 3rd node
and tried to join it, the 3rd node was not able to join the cluster and I am
getting the messages below. Please correct me in case I am wrong
somewhere.

> WARN  [GossipStage:1] 2017-11-07 17:01:42,706 TokenMetadata.java:215 -
> Token -2625048720051242117 changing ownership from /172.16.51.160 to /
> 172.16.51.185
> WARN  [GossipStage:1] 2017-11-07 17:01:42,707 TokenMetadata.java:215 -
> Token 2046352110807728035 changing ownership from /172.16.51.160 to /
> 172.16.51.185
> WARN  [GossipStage:1] 2017-11-07 17:01:42,708 TokenMetadata.java:215 -
> Token 6738112847220178646 changing ownership from /172.16.51.160 to /
> 172.16.51.185
> WARN  [GossipStage:1] 2017-11-07 17:01:42,710 TokenMetadata.java:215 -
> Token 5278402616817535783 changing ownership from /172.16.51.160 to /
> 172.16.51.185
> WARN  [GossipStage:1] 2017-11-07 17:01:42,710 TokenMetadata.java:215 -
> Token -4301762166942209316 changing ownership from /172.16.51.160 to /
> 172.16.51.185
> WARN  [GossipStage:1] 2017-11-07 17:01:42,711 TokenMetadata.java:215 -
> Token -5795150382485882189 changing ownership from /172.16.51.160 to /
> 172.16.51.185
> WARN  [GossipStage:1] 2017-11-07 17:01:42,711 TokenMetadata.java:215 -
> Token -7650474240486110510 changing ownership from /172.16.51.160 to /
> 172.16.51.185
> WARN  [GossipStage:1] 2017-11-07 17:01:42,712 TokenMetadata.java:215 -
> Token -7529017452803179703 changing ownership from /172.16.51.160 to /
> 172.16.51.185
> WARN  [GossipStage:1] 2017-11-07 17:01:42,713 TokenMetadata.java:215 -
> Token -6321052415922186365 changing ownership from /172.16.51.160 to /
> 172.16.51.185
> WARN  [GossipStage:1] 2017-11-07 17:01:42,713 TokenMetadata.java:215 -
> Token 505028918401730880 changing ownership from /172.16.51.160 to /
> 172.16.51.185
> WARN  [GossipStage:1] 2017-11-07 17:01:42,714 TokenMetadata.java:215 -
> Token -6468981120406928540 changing ownership from /172.16.51.160 to /
> 172.16.51.185
> WARN  [GossipStage:1] 2017-11-07 17:01:42,714 TokenMetadata.java:215 -
> Token -7886589494812723279 changing ownership from /172.16.51.160 to /
> 172.16.51.185
> WARN  [GossipStage:1] 2017-11-07 17:01:42,715 TokenMetadata.java:215 -
> Token 6159957549175666279 changing ownership from /172.16.51.160 to /
> 172.16.51.185
> WARN  [GossipStage:1] 2017-11-07 17:01:42,715 TokenMetadata.java:215 -
> Token 1371713730179023942 changing ownership from /172.16.51.160 to /
> 172.16.51.185
> WARN  [GossipStage:1] 2017-11-07 17:01:42,716 TokenMetadata.java:215 -
> Token 3849374212689985831 changing ownership from /172.16.51.160 to /
> 172.16.51.185
> WARN  [GossipStage:1] 2017-11-07 17:01:42,717 TokenMetadata.java:215 -
> Token -9130845474615238557 changing ownership from /172.16.51.160 to /
> 172.16.51.185
> WARN  [GossipStage:1] 2017-11-07 17:01:42,717 TokenMetadata.java:215 -
> Token -2166821314373815731 changing ownership from /172.16.51.160 to /
> 172.16.51.185
> WARN  [GossipStage:1] 2017-11-07 17:01:42,718 TokenMetadata.java:215 -
> Token 8172072992908340093 changing ownership from /172.16.51.160 to /
> 172.16.51.185
> WARN  [GossipStage:1] 2017-11-07 17:01:42,719 TokenMetadata.java:215 -
> Token 5862934089465703397 changing ownership from /172.16.51.160 to /
> 172.16.51.185
> WARN  [GossipStage:1] 2017-11-07 17:01:42,719 TokenMetadata.java:215 -
> Token 8683259820133858856 changing ownership from /172.16.51.160 to /
> 172.16.51.185
> WARN  [GossipStage:1] 2017-11-07 17:01:42,720 TokenMetadata.java:215 -
> Token -6722468819050104438 changing ownership from /172.16.51.160 to /
> 172.16.51.185
> WARN  [GossipStage:1] 2017-11-07 17:01:42,720 TokenMetadata.java:215 -
> Token 390606262920645821 changing ownership from /172.16.51.160 to /
> 172.16.51.185
> WARN  [GossipStage:1] 2017-11-07 17:01:42,721 TokenMetadata.java:215 -
> Token -2191770340916232589 changing ownership from /172.16.51.160 to /
> 172.16.51.185
> WARN  [GossipStage:1] 2017-11-07 17:01:42,721 TokenMetadata.java:215 -
> Token -2315498798364455538 changing ownership from /172.16.51.160 to /
> 172.16.51.185
> WARN  [GossipStage:1] 2017-11-07 17:01:42,722 TokenMetadata.java:215 -
> Token -4289328221922359195 changing ownership from /172.16.51.160 to /
> 172.16.51.185
> WARN  [GossipStage:1] 2017-11-07 17:01:42,722 TokenMetadata.java:215 -
> Token -3989195857321645521 changing ownership from /172.16.51.160 to /
> 172.16.51.185
> WARN  [GossipStage:1] 2017-11-07 17:01:42,723 TokenMetadata.java:215 -
> Token -6852043742105779264 changing ownership from /172.16.51.160 to /
> 172.16.51.185


Thanks and Best Regards,
Nandan


On Tue, Nov 7, 2017 at 12:48 PM, @Nandan@ <nandanpriyadarshi...@gmail.com>
wrote:

> Hi All,
>
> Thanks 

Re: Reg :- Multiple Node Cluster set up in Virtual Box

2017-11-06 Thread @Nandan@
Hi All,

Thanks for sharing all information.
I am starting to work on this.
The problems I am facing right now are:
1) How do I select the network type for a virtual machine so that each
VirtualBox VM gets a different IP?
2) Since my host machine (Windows 10) uses WiFi, is any internal
configuration required, or do I just need to select a specific network
adapter in VirtualBox so that node1, node2, and node3 get IP1, IP2, and IP3
respectively?

Please give me some ideas.
Thanks in advance,
Nandan Priyadarshi


On Tue, Nov 7, 2017 at 8:28 AM, James Briggs <james.bri...@yahoo.com.invalid
> wrote:

> Nandan: The original Datastax training classes (when it was still called
> Riptano)
> used 3 virtualbox Debian instances to setup a Cassandra cluster.
>
> Thanks, James Briggs.
> --
> Cassandra/MySQL DBA. Available in San Jose area or remote.
> cass_top: https://github.com/jamesbriggs/cassandra-top
>
>
> --
> *From:* kurt greaves <k...@instaclustr.com>
> *To:* User <user@cassandra.apache.org>
> *Sent:* Monday, November 6, 2017 3:08 PM
> *Subject:* Re: Reg :- Multiple Node Cluster set up in Virtual Box
>
> Worth keeping in mind that in 3.6 onwards nodes will not start unless they
> can contact a seed. Not quite SPOF but still problematic. CASSANDRA-13851
> <https://issues.apache.org/jira/browse/CASSANDRA-13851>​
>
>
>


Re: Reg :- Multiple Node Cluster set up in Virtual Box

2017-11-06 Thread @Nandan@
Hi Jeff,
Thanks for your suggestion.
I have a few questions here.
1) It is fine to set up the first node with node1's IP as the seed, and then
set up nodes 2, 3, and 4 with the same seed (node1's IP address), but
doesn't this also become a SPOF if node1 goes down for some time?
2) Is it possible, after setting up the cluster with node1's IP as the seed
and the cluster name "ABC Cluster", to later change the seed nodes, e.g. to
add node3's and node4's IP addresses to the seed list on all 4 nodes?
3) If it is possible, is there any chance that the cluster may get
disturbed by changing seeds?
Please clarify these doubts.

Thanks and Best Regards,
Nandan Priyadarshi


On Nov 6, 2017 10:44 PM, "Jeff Jirsa" <jji...@gmail.com> wrote:

> Looks like official docs for first-time-setup are pretty lacking.
>
> One node at a time:
> - Install the deb package: http://cassandra.apache.org/doc/latest/getting_
> started/installing.html
> - Then configure - http://cassandra.apache.org/doc/latest/getting_started/
> configuring.html
>   - Pick a cluster name
>   - Set the listen_address (and maybe broadcast_address)
>   - Put the IP of the first node as the seed.
> - Start the node
> - Wait 2 minutes and then proceed to the next one.
>
>
>
>
>
> On Mon, Nov 6, 2017 at 6:33 AM, @Nandan@ <nandanpriyadarshi...@gmail.com>
> wrote:
>
>> Hi Varun ,
>> I tried CCM , but as for practice and for deep learning , finally I
>> understood that CCM is not a good way to go along.
>> Like my goal is to learn about configuration aspects as well as to know
>> in details about administration parts.
>> So I am trying to do configure all 4 virtual boxs as 4 nodes.
>> Thanks for reply. Hope we will work on this .
>>
>>
>> Thanks,
>> Nandan Priyadarshi
>>
>> On Nov 6, 2017 10:29 PM, "Varun Barala" <varunbaral...@gmail.com> wrote:
>>
>>> you can try *CCM*
>>> https://academy.datastax.com/planet-cassandra/getting-starte
>>> d-with-ccm-cassandra-cluster-manager
>>>
>>> Thanks
>>>
>>> On Mon, Nov 6, 2017 at 10:12 PM, @Nandan@ <nandanpriyadarshi...@gmail.co
>>> m> wrote:
>>>
>>>> Hi Users ,
>>>>  Just seeking some perfect guidelines to set up multi-node cluster
>>>> single Data Center in single host machine.
>>>> I am currently using windows 10 as host machine and installed Oracle
>>>> virtual box in which I created 4 virtual machines and all had Ubuntu 16.04
>>>> I would like to configure a flexible robust no SPOF  data center.
>>>> So please let me know how do I start and what steps, I have to follow
>>>> to configure this multi node cluster?
>>>> My goal is to create 4 node cluster now and later based on learning
>>>> experiences I will remove 1 node and add 2 more nodes to check everything
>>>> should be working perfectly.
>>>>
>>>> Just hope to get some step by step guidelines from all of you.
>>>>
>>>> Thanks in advance and best regards,
>>>> Nandan Priyadarshi
>>>>
>>>
>>>
>


Re: Reg :- Multiple Node Cluster set up in Virtual Box

2017-11-06 Thread @Nandan@
Hi Varun ,
I tried CCM, but for practice and deep learning I eventually
concluded that CCM is not the way to go.
My goal is to learn about the configuration aspects as well as the
details of the administration side.
So I am trying to configure all 4 VirtualBox VMs as 4 nodes.
Thanks for the reply. I hope we can work on this.


Thanks,
Nandan Priyadarshi

On Nov 6, 2017 10:29 PM, "Varun Barala" <varunbaral...@gmail.com> wrote:

> you can try *CCM*
> https://academy.datastax.com/planet-cassandra/getting-
> started-with-ccm-cassandra-cluster-manager
>
> Thanks
>
> On Mon, Nov 6, 2017 at 10:12 PM, @Nandan@ <nandanpriyadarshi...@gmail.com>
> wrote:
>
>> Hi Users ,
>>  Just seeking some perfect guidelines to set up multi-node cluster
>> single Data Center in single host machine.
>> I am currently using windows 10 as host machine and installed Oracle
>> virtual box in which I created 4 virtual machines and all had Ubuntu 16.04
>> I would like to configure a flexible robust no SPOF  data center.
>> So please let me know how do I start and what steps, I have to follow to
>> configure this multi node cluster?
>> My goal is to create 4 node cluster now and later based on learning
>> experiences I will remove 1 node and add 2 more nodes to check everything
>> should be working perfectly.
>>
>> Just hope to get some step by step guidelines from all of you.
>>
>> Thanks in advance and best regards,
>> Nandan Priyadarshi
>>
>
>


Reg :- Multiple Node Cluster set up in Virtual Box

2017-11-06 Thread @Nandan@
Hi Users,
I am seeking guidelines to set up a multi-node, single-data-center
cluster on a single host machine.
I am currently using Windows 10 as the host machine and have installed
Oracle VirtualBox, in which I created 4 virtual machines, all running
Ubuntu 16.04.
I would like to configure a flexible, robust data center with no SPOF.
So please let me know how to start and what steps I have to follow to
configure this multi-node cluster.
My goal is to create a 4-node cluster now; later, based on what I learn,
I will remove 1 node and add 2 more nodes to check that everything
works perfectly.

Just hope to get some step by step guidelines from all of you.

Thanks in advance and best regards,
Nandan Priyadarshi


Reg:- Install / Configure Cassandra on 2 DCs with 3 nodes

2017-09-19 Thread @Nandan@
Hi Techies,

I need to configure Apache Cassandra for my upcoming project on 2 DCs.
Each DC should have 3 nodes.
Details:
DC1 nodes --
Node 1 ->10.0.0.1
Node 2 -> 10.0.0.2
Node 3 -> 10.0.0.3
DC2 nodes --
Node 1 -> 10.0.0.4
Node 2 -> 10.0.0.5
Node 3 -> 10.0.0.6

On all nodes I want to use Ubuntu 16.04.
Please suggest the best way to configure my DCs; I may also extend them
further in the future.

Best Regards,
Nandan Priyadarshi
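
For a two-DC layout like the one described above, the keyspace would
typically use NetworkTopologyStrategy once the nodes are configured with a
DC-aware snitch such as GossipingPropertyFileSnitch. A minimal sketch,
assuming the data centers are named DC1 and DC2 (the keyspace name and DC
names here are illustrative, not from the thread):

```sql
-- Illustrative keyspace for the 2-DC / 3-nodes-each layout; 'DC1' and
-- 'DC2' must match the data center names reported by `nodetool status`.
CREATE KEYSPACE my_app
  WITH replication = {
    'class': 'NetworkTopologyStrategy',
    'DC1': 3,
    'DC2': 3
  }
  AND durable_writes = true;
```

With RF 3 in each DC, LOCAL_QUORUM reads and writes need 2 of 3 replicas,
so each DC can tolerate one node being down.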


Re: why the cluster does not work well after addding two new nodes

2017-08-29 Thread @Nandan@
Hi,
Here is what went wrong from the start:
1) You had 2 node servers but created the keyspace with RF 3. [Always make
sure RF <= total number of nodes.]
2) While adding new nodes, make sure auto_bootstrap is enabled.
3) Once you have added the 2 new nodes, you should rebalance.
There are 2 different ways to rebalance:
A) Use OpsCenter and select Rebalance Cluster.
B) Run nodetool cleanup on each old node afterward to clean up data no
longer belonging to that node.

Best Regards,
Nandan
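
The cleanup in step (3B) can be sketched as a shell sequence (a sketch of
the ops flow only; the keyspace name is an assumption):

```shell
# Run on each pre-existing node once the new nodes have finished joining:
# removes SSTable data for token ranges the node no longer owns.
nodetool cleanup my_keyspace

# Then verify that ownership looks balanced across the ring:
nodetool status my_keyspace
```

Cleanup is I/O-intensive, so it is usually run one node at a time.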


On Wed, Aug 30, 2017 at 12:14 PM, 赵豫峰 <zha...@easemob.com> wrote:

> Hi, I have a cluster with two node servers(I know it’s in a wrong way  but
> it‘s builded by another colleague who has left), and it's keyspace set like:
>
> CREATE KEYSPACE my_keyspace WITH replication = {'class': 'SimpleStrategy',
> 'replication_factor': '3'}  AND durable_writes = true;
>
> one day my boss said one node was down for a long time and another worked
> normally, tell my to restart the cluster.
>
> First, I make a snapshot from the working node;
> then, I check the data numbers with select count(*) cql statement, the
> result is more then 17;
> Next, I add two new nodes. After new node worked, I use select
> count(*)  cql to check the data several times, but now I get uncertain
> resluts, and each reslut is less then 1; I check node status with
> ./nodetool status cql, and every node is UN, but the load of two new nodes
> is far less then the normal node。
> I stop the two new nodes, use “select count(*)” cql and get the right
> result again.
>
> I build a new cluster in sandbox env with snapshot file, and get the same
> result like above。 I used "./nodetool repair" sql,then the cluster works
> well but I don't know why.
>
> I guess it because two nodes with "replication = {'class':
> 'SimpleStrategy', 'replication_factor': '3'} " can make splite brain and
> the data won't be consistent,or the data file is broken but not make
> sure。Why did it happen, why I have to use "./nodetool repair" command, and
> when to use it?
>
> Thanks!
>
>
>
>
> --
> 赵豫峰
>
> 环信即时通讯云/研发
>
>


Re: Help in c* Data modelling

2017-07-23 Thread @Nandan@
Hi ,
The best way will go with per query per table plan.. and distribute the
common column into both tables.
This will help you to support queries as well as Read and Write will be
fast.
Only Drawback will be, you have to insert common data into both tables at
the same time which can be easily handled by the Client side.

On Mon, Jul 24, 2017 at 6:10 AM, Jonathan Haddad  wrote:

> Using a different table to answer each query is the correct answer here
> assuming there's a significant amount of data.
>
> If you don't have that much data, maybe you should consider using a
> database like Postgres which gives you query flexibility instead of
> horizontal scalability.
> On Sun, Jul 23, 2017 at 1:10 PM techpyaasa .  wrote:
>
>> Hi vladyu/varunbarala
>>
>> Instead of creating second table as you said can I just have one(first)
>> table below and get all rows with status=0.
>>
>> CREATE TABLE IF NOT EXISTS test.user ( account_id bigint, pid bigint, 
>> disp_name text, status int, PRIMARY KEY (account_id, pid) ) WITH CLUSTERING 
>> ORDER BY (pid ASC);
>>>
>>
>> I mean get all rows within same partition(account_id) whose status=0(say 
>> some value) using *UDF/UDA* in c* ?
>>
>>>
>>> select group_by_status from test.user;
>>
>>
>> where group_by_status is UDA/UDF
>>
>>
>> Thanks in advance
>> TechPyaasa
>>
>>
>> On Sun, Jul 23, 2017 at 10:42 PM, Vladimir Yudovin 
>> wrote:
>>
>>> Hi,
>>>
>>> unfortunately ORDER BY is supported for clustering columns only...
>>>
>>> *Winguzone  - Cloud Cassandra Hosting*
>>>
>>>
>>>  On Sun, 23 Jul 2017 12:49:36 -0400 *techpyaasa .
>>> >* wrote 
>>>
>>> Hi Varun,
>>>
>>> Thanks a lot for your reply.
>>>
>>> In this case if I want to update status(status can be updated for given
>>> account_id, pid) , I need to delete existing row in 2nd table & add new
>>> one...  :( :(
>>>
>>> Its like hitting cassandra twice for 1 change.. :(
>>>
>>>
>>>
>>> On Sun, Jul 23, 2017 at 8:42 PM, Varun Barala 
>>> wrote:
>>>
>>> Hi,
>>> You can create pseudo index table.
>>>
>>> IMO, structure can be:-
>>>
>>>
>>> CREATE TABLE IF NOT EXISTS test.user ( account_id bigint, pid bigint, 
>>> disp_name text, status int, PRIMARY KEY (account_id, pid) ) WITH CLUSTERING 
>>> ORDER BY (pid ASC);
>>> CREATE TABLE IF NOT EXISTS test.user_index ( account_id bigint, pid bigint, 
>>> disp_name text, status int, PRIMARY KEY ((account_id, status), disp_name) ) 
>>> WITH CLUSTERING ORDER BY (disp_name ASC);
>>>
>>> to support query *:-  select * from site24x7.wm_current_status where
>>> uid=1 order by dispName asc;*
>>> You can use *in condition* on last partition key *status *in table
>>> *test.user_index.*
>>>
>>>
>>> *It depends on your use case and amount of data as well. It can be
>>> optimized more...*
>>> Thanks!!
>>>
>>> On Sun, Jul 23, 2017 at 2:48 AM, techpyaasa . 
>>> wrote:
>>>
>>> Hi ,
>>>
>>> We have a table like below :
>>>
>>> CREATE TABLE ks.cf ( accountId bigint, pid bigint, dispName text,
>>> status int, PRIMARY KEY (accountId, pid) ) WITH CLUSTERING ORDER BY (pid
>>> ASC);
>>>
>>>
>>>
>>> We would like to have following queries possible on the above table:
>>>
>>> select * from site24x7.wm_current_status where uid=1 and mid=1;
>>> select * from site24x7.wm_current_status where uid=1 order by dispName
>>> asc;
>>> select * from site24x7.wm_current_status where uid=1 and status=0 order
>>> by dispName asc;
>>>
>>> I know first query is possible by default , but I want the last 2
>>> queries also to work.
>>>
>>> So can some one please let me know how can I achieve the same in
>>> cassandra(c*-2.1.17). I'm ok with applying indexes etc,
>>>
>>> Thanks
>>> TechPyaasa
>>>
>>>
>>>
>>


Reg:- Data Modelling Conceptual [DISCUSS]

2017-06-22 Thread @Nandan@
Hi All,

I am working on a data model and want to discuss the conditions below,
with their pros and cons.
Requirement:
1) User Registration Module
1.1) Multiple types of users, such as Buyer and Seller.
1.2) Registration pages differ per user type and
contain different numbers of columns.
Note: a single mobile number / single email is allowed per user.
2) User Login Module
2.1) Login can be done by email.
2.2) Login can be done by mobile number.

If we go with the denormalized method, as is usual in Cassandra, then we
have to create these tables:
1) users-> PRIMARY KEY(userid) Type UUID
2) user_by_email-> PRIMARY KEY(email,userid)
3) user_by_mobile   -> PRIMARY KEY(mobile,userid)

Currently, I am using 2 DCs with 4 nodes in each DC. Is it a good idea
instead to create a single USERS table and put SASI indexes on the "email"
and "mobile" columns?
What would the pros and cons be?
Please give your best suggestions.
Thanks.
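
A minimal sketch of the denormalized layout listed above (any columns
beyond the keys are assumptions added for illustration):

```sql
-- Main table, keyed by the user's UUID.
CREATE TABLE users (
  userid    uuid PRIMARY KEY,
  user_type text,   -- e.g. 'buyer' or 'seller' (illustrative)
  email     text,
  mobile    text
);

-- Lookup tables, one per login path.
CREATE TABLE user_by_email (
  email  text,
  userid uuid,
  PRIMARY KEY (email, userid)
);

CREATE TABLE user_by_mobile (
  mobile text,
  userid uuid,
  PRIMARY KEY (mobile, userid)
);
```

With this layout, a login by email costs one lookup in user_by_email
followed by one read from users by userid, at the price of writing three
tables on registration.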


Re: LIKE

2017-06-21 Thread @Nandan@
If you are sure that you want to do LIKE queries, then you can go with SASI.

https://docs.datastax.com/en/dse/5.1/cql/cql/cql_using/useSASIIndex.html
Hope this helps.
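
A minimal SASI sketch for the prefix query in the question. Note that a
secondary index cannot be created on the partition key itself, so this
assumes a lookup table where username is a regular column (the table and
index names here are illustrative):

```sql
CREATE TABLE users_by_id (
  id       uuid PRIMARY KEY,
  username text
);

CREATE CUSTOM INDEX users_username_prefix_idx ON users_by_id (username)
USING 'org.apache.cassandra.index.sasi.SASIIndex'
WITH OPTIONS = {
  'mode': 'PREFIX',
  'analyzer_class': 'org.apache.cassandra.index.sasi.analyzer.NonTokenizingAnalyzer',
  'case_sensitive': 'false'
};

-- Case-insensitive prefix match, limited to 10 rows:
SELECT username FROM users_by_id WHERE username LIKE 'shl%' LIMIT 10;
```

SASI requires Cassandra 3.4 or later, and an unrestricted LIKE query still
fans out across the cluster, so it should be benchmarked before relying on
it at scale.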

On Wed, Jun 21, 2017 at 2:44 PM, web master  wrote:

> I have this table
>
> CREATE TABLE users_by_username (
> username text PRIMARY KEY,
> email text,
> age int
> )
>
> I want to run query like the following
>
> select username from users where username LIKE 'shl%' LIMIT 10;
>
>
> Always , I want to find only 10 username (Case insensitive) that start
> with specific characters , How can I do it efficient? I want to read
> minimum partitions and best performance
>
>


Re: Secondary Index

2017-06-20 Thread @Nandan@
Hi,
You would be better off denormalizing the data based on status:
CREATE TABLE ks1.sta1 (status int, id1 bigint, id2 bigint, resp text,
PRIMARY KEY (status, id1));
This will allow you to query as you want:
select * from ks1.sta1 where status = 0 and id1 = 123;

Please make sure that (status, id1) uniquely identifies your data.

On Tue, Jun 20, 2017 at 3:29 PM, techpyaasa .  wrote:

> Hi ZAIDI,
>
> Thanks for reply.
> Sorry I didn't get your line
> "You can get away the potential situation by leveraging composite key, if
> that is possible for you?"
>
> How can I get through it??
>
> Like I have a table as below
> CREATE TABLE ks1.cf1 (id1 bigint, id2 bigint, resp text, status int,
> PRIMARY KEY (id1, id2)
> ) WITH CLUSTERING ORDER BY (id2 ASC)
>
> 'status' will have values of 0/1/2/3/4 (4 possible values) , insertions to
> table(partition) will happen based on id2 i.e values(id1,id2,resp,status)
>
> I want to have a filtering/criteria applied on 'status' column too like
> select * from ks1.cf1 where id1=123 and status=0;
>
> How can I achieve this w/o secondary index (on 'status' column )??
>
>
> On Tue, Jun 20, 2017 at 12:09 AM, ZAIDI, ASAD A  wrote:
>
>> If you’re only creating index so that your query work, think again!
>> You’ll be storing secondary index on each node , queries involving index
>> could create issues (slowness!!) down the road the when index on multiple
>> node Is involved and  not maintained!  Tables involving a lot of
>> inserts/delete could easily ruin index performance.
>>
>>
>>
>> You can get away the potential situation by leveraging composite key, if
>> that is possible for you?
>>
>>
>>
>>
>>
>> *From:* techpyaasa . [mailto:techpya...@gmail.com]
>> *Sent:* Monday, June 19, 2017 1:01 PM
>> *To:* user@cassandra.apache.org
>> *Subject:* Secondary Index
>>
>>
>>
>> Hi,
>>
>> I want to create Index on already existing table which has more than 3
>> GB/node.
>> We are using c*-2.1.17 with 2 DCs , each DC with 3 groups and each group
>> has 7 nodes.(Total 42 nodes in cluster)
>>
>> So is it ok to create Index on this table now or will it have any problem?
>> If its ok , how much time it would take for this process?
>>
>>
>> Thanks in advance,
>> TechPyaasa
>>
>
>


Re: Reg:- Cassandra Data modelling for Search

2017-06-12 Thread @Nandan@
Yes, I am not planning to go with MVs; I will implement the views myself.
Maybe I can get some ideas about using cassandra-stress for data
generation and such.
Thanks, Jonathan.
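
A cassandra-stress data-generation run might look like the following (a
sketch only; the profile file name, the read1 operation name, the thread
count, and the node address are all assumptions):

```shell
# Insert 100k rows using a user-defined schema/profile YAML:
cassandra-stress user profile=video-profile.yaml 'ops(insert=1)' \
    n=100000 -rate threads=50 -node 10.0.0.1

# Then exercise a query named read1 that is defined in the same profile:
cassandra-stress user profile=video-profile.yaml 'ops(read1=1)' \
    n=10000 -node 10.0.0.1
```

The profile YAML describes the table schema, column value distributions,
and named queries, so the generated data can mirror the real video tables.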

On Tue, Jun 13, 2017 at 10:44 AM, Jonathan Haddad <j...@jonhaddad.com> wrote:

> Unless you're willing to put in a lot of time fixing bugs, I'd recommend
> avoiding 3.0's materialized views and manage them yourself.
>
> On Mon, Jun 12, 2017 at 6:11 PM @Nandan@ <nandanpriyadarshi...@gmail.com>
> wrote:
>
>> Correct, Our first concern is to store huge READ and WRITE, for that
>> Cassandra is our First and Best Choice. But according to Use Case, we need
>> to implement Advance search like Partial text, Phrase search etc.. So we
>> are thinking the best way, that how to implement data model.
>>
>>
>> On Tue, Jun 13, 2017 at 3:35 AM, Oskar Kjellin <oskar.kjel...@gmail.com>
>> wrote:
>>
>>> Agree, I meant as Jonathan said to use C* for primary key and as a
>>> primary storage and ES as an indexed version of what you have in cassandra.
>>>
>>> 2017-06-12 19:19 GMT+02:00 DuyHai Doan <doanduy...@gmail.com>:
>>>
>>>> Sorry, I misread some reply I had the impression that people recommend
>>>> ES as primary datastore
>>>>
>>>> On Mon, Jun 12, 2017 at 7:12 PM, Jonathan Haddad <j...@jonhaddad.com>
>>>> wrote:
>>>>
>>>>> Nobody is promoting ES as a primary datastore in this thread.  Every
>>>>> mention of it is to accompany C*.
>>>>>
>>>>>
>>>>>
>>>>> On Mon, Jun 12, 2017 at 10:03 AM DuyHai Doan <doanduy...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> For all those promoting ES as a PRIMARY datastore, please read this
>>>>>> before:
>>>>>>
>>>>>> https://discuss.elastic.co/t/elasticsearch-as-a-primary-
>>>>>> database/85733/13
>>>>>>
>>>>>> There are a lot of warning before recommending ES as a datastore.
>>>>>>
>>>>>> The answer from Pilato, ES official evangelist:
>>>>>>
>>>>>>
>>>>>>- You absolutely care about your data and you want to be able to
>>>>>>reindex in all cases. You need for that a datastore. A datastore can 
>>>>>> be a
>>>>>>filesystem where you store JSON, HDFS, and/or a database you prefer 
>>>>>> and you
>>>>>>are confident with. About how to inject data in it, you may want to 
>>>>>> read:
>>>>>>http://david.pilato.fr/blog/2015/05/09/advanced-
>>>>>>search-for-your-legacy-application/7
>>>>>>
>>>>>> <http://david.pilato.fr/blog/2015/05/09/advanced-search-for-your-legacy-application/>
>>>>>>.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Mon, Jun 12, 2017 at 5:08 PM, Michael Mior <mm...@uwaterloo.ca>
>>>>>> wrote:
>>>>>>
>>>>>>> For queries 1-5 this seems like a potentially good use case for
>>>>>>> materialized views. Create one table with the videos stored by ID and 
>>>>>>> the
>>>>>>> materialized views for each of the queries.
>>>>>>>
>>>>>>> --
>>>>>>> Michael Mior
>>>>>>> mm...@apache.org
>>>>>>>
>>>>>>>
>>>>>>> 2017-06-11 22:40 GMT-04:00 @Nandan@ <nandanpriyadarshi...@gmail.com>
>>>>>>> :
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> Currently, I am working on data modeling for Video Company in which
>>>>>>>> we have different types of users as well as different user 
>>>>>>>> functionality.
>>>>>>>> But currently, my concern is about Search video module based on
>>>>>>>> different fields.
>>>>>>>>
>>>>>>>> Query patterns are as below:-
>>>>>>>> 1) Select video by actor.
>>>>>>>> 2) select video by producer.
>>>>>>>> 3) select video by music.
>>>>>>>> 4) select video by actor and producer.
>>>>>>>> 5) select video by actor and music.
>>>>>>>>
>>>>>>>> Note: - In short, We want to establish an advanced search module by
>>>>>>>> which we can search by anyway and get the desired results.
>>>>>>>>
>>>>>>>> During a search , we need partial search also such that if any user
>>>>>>>> can search "Harry" title, then we are able to give them result as all
>>>>>>>> videos whose
>>>>>>>>  title contains "Harry" at any location.
>>>>>>>>
>>>>>>>> As per my ideas, I have to create separate tables such as
>>>>>>>> video_by_actor, video_by_producer etc.. and implement solr query on all
>>>>>>>> tables. Otherwise,
>>>>>>>> is there any others way by which we can implement this search
>>>>>>>> module effectively.
>>>>>>>>
>>>>>>>> Please suggest.
>>>>>>>>
>>>>>>>> Best regards,
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>
>>>
>>


Re: Reg:- Cassandra Data modelling for Search

2017-06-12 Thread @Nandan@
Correct. Our first concern is to handle heavy READ and WRITE traffic, and
for that Cassandra is our first and best choice. But our use case also
requires advanced search (partial text, phrase search, etc.), so we are
thinking about the best way to model the data.


On Tue, Jun 13, 2017 at 3:35 AM, Oskar Kjellin <oskar.kjel...@gmail.com>
wrote:

> Agree, I meant as Jonathan said to use C* for primary key and as a primary
> storage and ES as an indexed version of what you have in cassandra.
>
> 2017-06-12 19:19 GMT+02:00 DuyHai Doan <doanduy...@gmail.com>:
>
>> Sorry, I misread some reply I had the impression that people recommend ES
>> as primary datastore
>>
>> On Mon, Jun 12, 2017 at 7:12 PM, Jonathan Haddad <j...@jonhaddad.com>
>> wrote:
>>
>>> Nobody is promoting ES as a primary datastore in this thread.  Every
>>> mention of it is to accompany C*.
>>>
>>>
>>>
>>> On Mon, Jun 12, 2017 at 10:03 AM DuyHai Doan <doanduy...@gmail.com>
>>> wrote:
>>>
>>>> For all those promoting ES as a PRIMARY datastore, please read this
>>>> before:
>>>>
>>>> https://discuss.elastic.co/t/elasticsearch-as-a-primary-data
>>>> base/85733/13
>>>>
>>>> There are a lot of warning before recommending ES as a datastore.
>>>>
>>>> The answer from Pilato, ES official evangelist:
>>>>
>>>>
>>>>- You absolutely care about your data and you want to be able to
>>>>reindex in all cases. You need for that a datastore. A datastore can be 
>>>> a
>>>>filesystem where you store JSON, HDFS, and/or a database you prefer and 
>>>> you
>>>>are confident with. About how to inject data in it, you may want to 
>>>> read:
>>>>http://david.pilato.fr/blog/2015/05/09/advanced-search
>>>>-for-your-legacy-application/7
>>>>
>>>> <http://david.pilato.fr/blog/2015/05/09/advanced-search-for-your-legacy-application/>
>>>>.
>>>>
>>>>
>>>>
>>>>
>>>> On Mon, Jun 12, 2017 at 5:08 PM, Michael Mior <mm...@uwaterloo.ca>
>>>> wrote:
>>>>
>>>>> For queries 1-5 this seems like a potentially good use case for
>>>>> materialized views. Create one table with the videos stored by ID and the
>>>>> materialized views for each of the queries.
>>>>>
>>>>> --
>>>>> Michael Mior
>>>>> mm...@apache.org
>>>>>
>>>>>
>>>>> 2017-06-11 22:40 GMT-04:00 @Nandan@ <nandanpriyadarshi...@gmail.com>:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> Currently, I am working on data modeling for Video Company in which
>>>>>> we have different types of users as well as different user functionality.
>>>>>> But currently, my concern is about Search video module based on
>>>>>> different fields.
>>>>>>
>>>>>> Query patterns are as below:-
>>>>>> 1) Select video by actor.
>>>>>> 2) select video by producer.
>>>>>> 3) select video by music.
>>>>>> 4) select video by actor and producer.
>>>>>> 5) select video by actor and music.
>>>>>>
>>>>>> Note: - In short, We want to establish an advanced search module by
>>>>>> which we can search by anyway and get the desired results.
>>>>>>
>>>>>> During a search , we need partial search also such that if any user
>>>>>> can search "Harry" title, then we are able to give them result as all
>>>>>> videos whose
>>>>>>  title contains "Harry" at any location.
>>>>>>
>>>>>> As per my ideas, I have to create separate tables such as
>>>>>> video_by_actor, video_by_producer etc.. and implement solr query on all
>>>>>> tables. Otherwise,
>>>>>> is there any others way by which we can implement this search module
>>>>>> effectively.
>>>>>>
>>>>>> Please suggest.
>>>>>>
>>>>>> Best regards,
>>>>>>
>>>>>
>>>>>
>>>>
>>
>


Re: Reg:- Cassandra Data modelling for Search

2017-06-12 Thread @Nandan@
Hi Michael,
An MV is a good option when we select based on an equality search, but here
the goal is to build a model for advanced partial search.
Also, with MVs, if we have 2 DCs with 3 nodes each, every materialized view
replicates its data again across all 6 nodes, which becomes another problem.
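For reference, an equality-search materialized view over a base videos table might look like this (a sketch; the videos table and its column names are assumptions based on this thread, and a view's primary key may add at most one non-key column to the base table's key):

```sql
CREATE MATERIALIZED VIEW video_by_actor AS
    SELECT actor, videoid, title, release_date
    FROM videos
    WHERE actor IS NOT NULL AND videoid IS NOT NULL
    PRIMARY KEY (actor, videoid);

-- Equality search only; no partial matching:
SELECT * FROM video_by_actor WHERE actor = 'Daniel Radcliffe';
```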


On Mon, Jun 12, 2017 at 11:08 PM, Michael Mior <mm...@uwaterloo.ca> wrote:

> For queries 1-5 this seems like a potentially good use case for
> materialized views. Create one table with the videos stored by ID and the
> materialized views for each of the queries.
>
> --
> Michael Mior
> mm...@apache.org
>
>
> 2017-06-11 22:40 GMT-04:00 @Nandan@ <nandanpriyadarshi...@gmail.com>:
>
>> Hi,
>>
>> Currently, I am working on data modeling for Video Company in which we
>> have different types of users as well as different user functionality.
>> But currently, my concern is about Search video module based on different
>> fields.
>>
>> Query patterns are as below:-
>> 1) Select video by actor.
>> 2) select video by producer.
>> 3) select video by music.
>> 4) select video by actor and producer.
>> 5) select video by actor and music.
>>
>> Note: - In short, We want to establish an advanced search module by which
>> we can search by anyway and get the desired results.
>>
>> During a search , we need partial search also such that if any user can
>> search "Harry" title, then we are able to give them result as all videos
>> whose
>>  title contains "Harry" at any location.
>>
>> As per my ideas, I have to create separate tables such as video_by_actor,
>> video_by_producer etc.. and implement solr query on all tables. Otherwise,
>> is there any others way by which we can implement this search module
>> effectively.
>>
>> Please suggest.
>>
>> Best regards,
>>
>
>


Re: Reg:- Cassandra Data modelling for Search

2017-06-12 Thread @Nandan@
OK, then let's try to implement it and check the performance using
cassandra-stress.
I worked on another data model, for my company's book storage, in a similar
situation: a single table with 80 columns and a primary key of bookid uuid,
with Solr implemented on top of it. That's why I am trying to find the best
possible solution for upcoming projects.
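For such a check, cassandra-stress can drive a user-defined schema via a YAML profile; a minimal invocation might look like this (the profile filename, operation names, and counts are illustrative):

```
# Insert 100k rows using a user-defined schema profile, then read some back
cassandra-stress user profile=videos_stress.yaml ops(insert=1) n=100000 -rate threads=50
cassandra-stress user profile=videos_stress.yaml ops(read1=1) n=10000 -rate threads=50
```

The YAML profile defines the keyspace, table, column value distributions, and the named queries (such as `read1` here) that the `ops(...)` clause refers to.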


On Mon, Jun 12, 2017 at 7:51 PM, Eduardo Alonso <eduardoalo...@stratio.com>
wrote:

> -Virtual tokens are not recommended when using SOLR or
> cassandra-lucene-index.
>
> If you use your table schema you will not have any problem with partition
> size because your table is *not* a WIDE row table (it does not have
> clustering keys)
> The limit for 1 record with those 15 or 20 columns must not be larger that
> 100MB. You will have enough.
>
> Eduardo Alonso
> Vía de las dos Castillas, 33, Ática 4, 3ª Planta
> 28224 Pozuelo de Alarcón, Madrid
> Tel: +34 91 828 6473 <+34%20918%2028%2064%2073> // www.stratio.com // 
> *@stratiobd
> <https://twitter.com/StratioBD>*
>
> 2017-06-12 12:36 GMT+02:00 @Nandan@ <nandanpriyadarshi...@gmail.com>:
>
>> And due to single table videos, maybe it will go with around 15,20
>> columns, then we need to also think very carefully about partition sizes
>> also.
>>
>> On Mon, Jun 12, 2017 at 6:33 PM, @Nandan@ <nandanpriyadarshi...@gmail.com
>> > wrote:
>>
>>> Yes this is only Option I am also thinking like this as my second
>>> options. Before this I was thinking to do denormalize table based on search
>>> columns, but due to partial search this will be not that effective.
>>>
>>> Now suppose , if we are going with this single table as videos. and
>>> implemented with Solr/Lucene, then need to also care about num_tokens ?
>>>
>>>
>>> On Mon, Jun 12, 2017 at 6:27 PM, Eduardo Alonso <
>>> eduardoalo...@stratio.com> wrote:
>>>
>>>> Using cassandra collections
>>>>
>>>> CREATE TABLE videos (
>>>> videoid uuid primary key,
>>>> title text,
>>>> actor list<text>,
>>>> producer list<text>,
>>>> release_date timestamp,
>>>> description text,
>>>> music text,
>>>> etc...
>>>> );
>>>>
>>>> When using collection you need to take care of its length. Collections
>>>> are designed to store
>>>> <http://docs.datastax.com/en/cql/3.1/cql/cql_using/use_collections_c.html>only
>>>> a small amount of data
>>>> <http://docs.datastax.com/en/cql/3.1/cql/cql_using/use_collections_c.html>
>>>> .
>>>> 5/10 actors per movie is ok.
>>>>
>>>>
>>>> Eduardo Alonso
>>>> Vía de las dos Castillas, 33, Ática 4, 3ª Planta
>>>> 28224 Pozuelo de Alarcón, Madrid
>>>> Tel: +34 91 828 6473 <+34%20918%2028%2064%2073> // www.stratio.com // 
>>>> *@stratiobd
>>>> <https://twitter.com/StratioBD>*
>>>>
>>>> 2017-06-12 11:54 GMT+02:00 @Nandan@ <nandanpriyadarshi...@gmail.com>:
>>>>
>>>>> So In short we have to go with one single table as videos and put
>>>>> primary key as videoid uuid.
>>>>> But then how can we able to handle multiple actor name and producer
>>>>> name. ?
>>>>>
>>>>> On Mon, Jun 12, 2017 at 5:51 PM, Eduardo Alonso <
>>>>> eduardoalo...@stratio.com> wrote:
>>>>>
>>>>>> Yes, you are right.
>>>>>>
>>>>>> Table denormalization is useful just when you have unique primary
>>>>>> keys, not your case.
>>>>>> Denormalized tables are only different in its primary key, every
>>>>>> denormalized table contains all the data (it just change how it is
>>>>>> structured). So, if you need to index it, do it with just one table (the
>>>>>> one you showed us with videoid as the primary key is ok).
>>>>>>
>>>>>> Solr, Elastic and cassandra-lucene-index are both based on Lucene and
>>>>>> all of them fulfill all your needs.
>>>>>>
>>>>>> Solr (in DSE) and cassandra-lucene-index
>>>>>> <https://github.com/stratio/cassandra-lucene-index> are very well
>>>>>> integrated with cassandra using its secondary index interface. If you
>>>>>> choose elastic search you will need to code the integration (write mutex,
>>>>>&

Re: Reg:- Cassandra Data modelling for Search

2017-06-12 Thread @Nandan@
Also, with a single videos table we may end up with around 15-20 columns,
so we need to think very carefully about partition sizes as well.
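As a rough back-of-the-envelope check (simple arithmetic, not an official formula; the per-cell overhead and value sizes are illustrative guesses), one can estimate the size of a single-row partition from its column sizes:

```python
# Rough partition-size estimate for a single-row-per-partition table
# like videos (PK = videoid): size ~= sum of column sizes + per-cell overhead.

def estimate_partition_bytes(column_sizes, overhead_per_cell=15):
    """Estimate on-disk partition size in bytes.

    column_sizes: dict of column name -> average value size in bytes.
    overhead_per_cell: rough per-cell metadata overhead (illustrative).
    """
    return sum(size + overhead_per_cell for size in column_sizes.values())

# ~20 columns averaging 100 bytes each stays far below the usual
# 100 MB per-partition guideline mentioned earlier in this thread.
cols = {f"col{i}": 100 for i in range(20)}
size = estimate_partition_bytes(cols)
print(size)                       # 2300
print(size < 100 * 1024 * 1024)   # True
```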

On Mon, Jun 12, 2017 at 6:33 PM, @Nandan@ <nandanpriyadarshi...@gmail.com>
wrote:

> Yes this is only Option I am also thinking like this as my second options.
> Before this I was thinking to do denormalize table based on search columns,
> but due to partial search this will be not that effective.
>
> Now suppose , if we are going with this single table as videos. and
> implemented with Solr/Lucene, then need to also care about num_tokens ?
>
>
> On Mon, Jun 12, 2017 at 6:27 PM, Eduardo Alonso <eduardoalo...@stratio.com
> > wrote:
>
>> Using cassandra collections
>>
>> CREATE TABLE videos (
>> videoid uuid primary key,
>> title text,
>> actor list<text>,
>> producer list<text>,
>> release_date timestamp,
>> description text,
>> music text,
>> etc...
>> );
>>
>> When using collection you need to take care of its length. Collections
>> are designed to store
>> <http://docs.datastax.com/en/cql/3.1/cql/cql_using/use_collections_c.html>only
>> a small amount of data
>> <http://docs.datastax.com/en/cql/3.1/cql/cql_using/use_collections_c.html>
>> .
>> 5/10 actors per movie is ok.
>>
>>
>> Eduardo Alonso
>> Vía de las dos Castillas, 33, Ática 4, 3ª Planta
>> 28224 Pozuelo de Alarcón, Madrid
>> Tel: +34 91 828 6473 <+34%20918%2028%2064%2073> // www.stratio.com // 
>> *@stratiobd
>> <https://twitter.com/StratioBD>*
>>
>> 2017-06-12 11:54 GMT+02:00 @Nandan@ <nandanpriyadarshi...@gmail.com>:
>>
>>> So In short we have to go with one single table as videos and put
>>> primary key as videoid uuid.
>>> But then how can we able to handle multiple actor name and producer
>>> name. ?
>>>
>>> On Mon, Jun 12, 2017 at 5:51 PM, Eduardo Alonso <
>>> eduardoalo...@stratio.com> wrote:
>>>
>>>> Yes, you are right.
>>>>
>>>> Table denormalization is useful just when you have unique primary keys,
>>>> not your case.
>>>> Denormalized tables are only different in its primary key, every
>>>> denormalized table contains all the data (it just change how it is
>>>> structured). So, if you need to index it, do it with just one table (the
>>>> one you showed us with videoid as the primary key is ok).
>>>>
>>>> Solr, Elastic and cassandra-lucene-index are both based on Lucene and
>>>> all of them fulfill all your needs.
>>>>
>>>> Solr (in DSE) and cassandra-lucene-index
>>>> <https://github.com/stratio/cassandra-lucene-index> are very well
>>>> integrated with cassandra using its secondary index interface. If you
>>>> choose elastic search you will need to code the integration (write mutex,
>>>> both cluster synchronization (imagine something written in cassandra but
>>>> failed to write in elastic))
>>>>
>>>> I know i am not the most suitable to recommend you to use our product
>>>> cassandra-lucene-index
>>>> <https://github.com/stratio/cassandra-lucene-index> but it is open
>>>> source, just take a look.
>>>>
>>>> Eduardo Alonso
>>>> Vía de las dos Castillas, 33, Ática 4, 3ª Planta
>>>> 28224 Pozuelo de Alarcón, Madrid
>>>> Tel: +34 91 828 6473 <+34%20918%2028%2064%2073> // www.stratio.com // 
>>>> *@stratiobd
>>>> <https://twitter.com/StratioBD>*
>>>>
>>>> 2017-06-12 11:18 GMT+02:00 @Nandan@ <nandanpriyadarshi...@gmail.com>:
>>>>
>>>>> Hi Eduardo,
>>>>>
>>>>> And As we are trying to build an advanced search functionality in
>>>>> which we can able to do partial search based on actor, producer, director,
>>>>> etc. columns.
>>>>> So if we do denormalization of tables then we have to create tables
>>>>> such as below :-
>>>>> video_by_actor
>>>>> video_by_producer
>>>>> video_by_director
>>>>> video_by_date
>>>>> etc..
>>>>> By using denormalized, Cassandra only allows us to do equality search,
>>>>> but for implementing Partial search we need to implement solr on all above
>>>>> tables.
>>>>>
>>>>> This is my thinking, but I think this will be not correct way to
>>>>> implement Apache So

Re: Reg:- Cassandra Data modelling for Search

2017-06-12 Thread @Nandan@
Yes this is only Option I am also thinking like this as my second options.
Before this I was thinking to do denormalize table based on search columns,
but due to partial search this will be not that effective.

Now suppose , if we are going with this single table as videos. and
implemented with Solr/Lucene, then need to also care about num_tokens ?
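For comparison, the SASI index suggested elsewhere in this thread needs no external search engine; on Cassandra 3.4 or later, a CONTAINS-mode index on the single videos table supports the partial title search (the index name and analyzer options below are illustrative, and SASI should be evaluated carefully before production use):

```sql
CREATE CUSTOM INDEX videos_title_sasi ON videos (title)
USING 'org.apache.cassandra.index.sasi.SASIIndex'
WITH OPTIONS = {
    'mode': 'CONTAINS',
    'analyzed': 'true',
    'analyzer_class': 'org.apache.cassandra.index.sasi.analyzer.NonTokenizingAnalyzer',
    'case_sensitive': 'false'
};

-- Partial match anywhere in the title:
SELECT * FROM videos WHERE title LIKE '%Harry%';
```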


On Mon, Jun 12, 2017 at 6:27 PM, Eduardo Alonso <eduardoalo...@stratio.com>
wrote:

> Using cassandra collections
>
> CREATE TABLE videos (
> videoid uuid primary key,
> title text,
> actor list<text>,
> producer list<text>,
> release_date timestamp,
> description text,
> music text,
> etc...
> );
>
> When using collection you need to take care of its length. Collections
> are designed to store
> <http://docs.datastax.com/en/cql/3.1/cql/cql_using/use_collections_c.html>only
> a small amount of data
> <http://docs.datastax.com/en/cql/3.1/cql/cql_using/use_collections_c.html>
> .
> 5/10 actors per movie is ok.
>
>
> Eduardo Alonso
> Vía de las dos Castillas, 33, Ática 4, 3ª Planta
> 28224 Pozuelo de Alarcón, Madrid
> Tel: +34 91 828 6473 <+34%20918%2028%2064%2073> // www.stratio.com // 
> *@stratiobd
> <https://twitter.com/StratioBD>*
>
> 2017-06-12 11:54 GMT+02:00 @Nandan@ <nandanpriyadarshi...@gmail.com>:
>
>> So In short we have to go with one single table as videos and put primary
>> key as videoid uuid.
>> But then how can we able to handle multiple actor name and producer name.
>> ?
>>
>> On Mon, Jun 12, 2017 at 5:51 PM, Eduardo Alonso <
>> eduardoalo...@stratio.com> wrote:
>>
>>> Yes, you are right.
>>>
>>> Table denormalization is useful just when you have unique primary keys,
>>> not your case.
>>> Denormalized tables are only different in its primary key, every
>>> denormalized table contains all the data (it just change how it is
>>> structured). So, if you need to index it, do it with just one table (the
>>> one you showed us with videoid as the primary key is ok).
>>>
>>> Solr, Elastic and cassandra-lucene-index are both based on Lucene and
>>> all of them fulfill all your needs.
>>>
>>> Solr (in DSE) and cassandra-lucene-index
>>> <https://github.com/stratio/cassandra-lucene-index> are very well
>>> integrated with cassandra using its secondary index interface. If you
>>> choose elastic search you will need to code the integration (write mutex,
>>> both cluster synchronization (imagine something written in cassandra but
>>> failed to write in elastic))
>>>
>>> I know i am not the most suitable to recommend you to use our product
>>> cassandra-lucene-index
>>> <https://github.com/stratio/cassandra-lucene-index> but it is open
>>> source, just take a look.
>>>
>>> Eduardo Alonso
>>> Vía de las dos Castillas, 33, Ática 4, 3ª Planta
>>> 28224 Pozuelo de Alarcón, Madrid
>>> Tel: +34 91 828 6473 <+34%20918%2028%2064%2073> // www.stratio.com // 
>>> *@stratiobd
>>> <https://twitter.com/StratioBD>*
>>>
>>> 2017-06-12 11:18 GMT+02:00 @Nandan@ <nandanpriyadarshi...@gmail.com>:
>>>
>>>> Hi Eduardo,
>>>>
>>>> And As we are trying to build an advanced search functionality in which
>>>> we can able to do partial search based on actor, producer, director, etc.
>>>> columns.
>>>> So if we do denormalization of tables then we have to create tables
>>>> such as below :-
>>>> video_by_actor
>>>> video_by_producer
>>>> video_by_director
>>>> video_by_date
>>>> etc..
>>>> By using denormalized, Cassandra only allows us to do equality search,
>>>> but for implementing Partial search we need to implement solr on all above
>>>> tables.
>>>>
>>>> This is my thinking, but I think this will be not correct way to
>>>> implement Apache Solr on all tables.
>>>>
>>>> On Mon, Jun 12, 2017 at 5:11 PM, @Nandan@ <
>>>> nandanpriyadarshi...@gmail.com> wrote:
>>>>
>>>>> Hi Edurado,
>>>>>
>>>>> As you mentioned queries 1-6 ,
>>>>> In this condition, we have to proceed with a table like as below :-
>>>>> create table videos (
>>>>> videoid uuid primary key,
>>>>> title text,
>>>>> actor text,
>>>>> producer text,
>>>>> release_date timestamp,
>>>>> desc

Re: Reg:- Cassandra Data modelling for Search

2017-06-12 Thread @Nandan@
So, in short, we have to go with one single videos table with videoid uuid
as the primary key.
But then how can we handle multiple actor names and producer names?
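One option, assuming the collection schema Eduardo proposes (a sketch; names are illustrative): store the names in list columns and index the collections, which enables CONTAINS queries on their elements:

```sql
CREATE TABLE videos (
    videoid uuid PRIMARY KEY,
    title text,
    actor list<text>,
    producer list<text>
);

-- Index the collection so CONTAINS works without ALLOW FILTERING:
CREATE INDEX videos_actor_idx ON videos (actor);

SELECT * FROM videos WHERE actor CONTAINS 'Daniel Radcliffe';
```

Note that CONTAINS is still an exact match on a collection element; partial matching inside a name would still need SASI or an external search engine.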

On Mon, Jun 12, 2017 at 5:51 PM, Eduardo Alonso <eduardoalo...@stratio.com>
wrote:

> Yes, you are right.
>
> Table denormalization is useful just when you have unique primary keys,
> not your case.
> Denormalized tables are only different in its primary key, every
> denormalized table contains all the data (it just change how it is
> structured). So, if you need to index it, do it with just one table (the
> one you showed us with videoid as the primary key is ok).
>
> Solr, Elastic and cassandra-lucene-index are both based on Lucene and all
> of them fulfill all your needs.
>
> Solr (in DSE) and cassandra-lucene-index
> <https://github.com/stratio/cassandra-lucene-index> are very well
> integrated with cassandra using its secondary index interface. If you
> choose elastic search you will need to code the integration (write mutex,
> both cluster synchronization (imagine something written in cassandra but
> failed to write in elastic))
>
> I know i am not the most suitable to recommend you to use our product
> cassandra-lucene-index <https://github.com/stratio/cassandra-lucene-index>
> but it is open source, just take a look.
>
> Eduardo Alonso
> Vía de las dos Castillas, 33, Ática 4, 3ª Planta
> 28224 Pozuelo de Alarcón, Madrid
> Tel: +34 91 828 6473 <+34%20918%2028%2064%2073> // www.stratio.com // 
> *@stratiobd
> <https://twitter.com/StratioBD>*
>
> 2017-06-12 11:18 GMT+02:00 @Nandan@ <nandanpriyadarshi...@gmail.com>:
>
>> Hi Eduardo,
>>
>> And As we are trying to build an advanced search functionality in which
>> we can able to do partial search based on actor, producer, director, etc.
>> columns.
>> So if we do denormalization of tables then we have to create tables such
>> as below :-
>> video_by_actor
>> video_by_producer
>> video_by_director
>> video_by_date
>> etc..
>> By using denormalized, Cassandra only allows us to do equality search,
>> but for implementing Partial search we need to implement solr on all above
>> tables.
>>
>> This is my thinking, but I think this will be not correct way to
>> implement Apache Solr on all tables.
>>
>> On Mon, Jun 12, 2017 at 5:11 PM, @Nandan@ <nandanpriyadarshi...@gmail.com
>> > wrote:
>>
>>> Hi Edurado,
>>>
>>> As you mentioned queries 1-6 ,
>>> In this condition, we have to proceed with a table like as below :-
>>> create table videos (
>>> videoid uuid primary key,
>>> title text,
>>> actor text,
>>> producer text,
>>> release_date timestamp,
>>> description text,
>>> music text,
>>> etc...
>>> );
>>> This table will help to store video datas based on PK videoid and will
>>> give uniqeness due to uuid.
>>> But as we know , in one movie there are multiple actor, multiple
>>> producer, multiple music worked, So how can we store all these.. Only one
>>> option will left as to use collection type columns.
>>>
>>>
>>> On Mon, Jun 12, 2017 at 4:59 PM, Eduardo Alonso <
>>> eduardoalo...@stratio.com> wrote:
>>>
>>>> TLDR shouldBe *PD
>>>>
>>>> Eduardo Alonso
>>>> Vía de las dos Castillas, 33, Ática 4, 3ª Planta
>>>> 28224 Pozuelo de Alarcón, Madrid
>>>> Tel: +34 91 828 6473 <+34%20918%2028%2064%2073> // www.stratio.com // 
>>>> *@stratiobd
>>>> <https://twitter.com/StratioBD>*
>>>>
>>>> 2017-06-12 10:58 GMT+02:00 Eduardo Alonso <eduardoalo...@stratio.com>:
>>>>
>>>>> Hi Nandan:
>>>>>
>>>>> So, your system must provide these queries:
>>>>>
>>>>> 1 - SELECT video FROM ... WHERE actor = '...';
>>>>> 2 - SELECT video FROM ... WHERE producer = '...';
>>>>> 3 - SELECT video FROM ... WHERE music = '...';
>>>>> 4 - SELECT video FROM ... WHERE actor = '...' AND producer ='...';
>>>>> 5 - SELECT video FROM ... WHERE actor = '...' AND music = '...';
>>>>> 6 - SELECT video WHERE title CONTAINS 'Harry';
>>>>>
>>>>>
>>>>> For queries 1-5 you can get them with just cassandra, denormalizing
>>>>> tables just the way your mentioned but without solr, just cassandra
>>>>> (Indeed, just for equality clauses)
>>>>>
>>>>> video_b

Re: Reg:- Cassandra Data modelling for Search

2017-06-12 Thread @Nandan@
Hi Eduardo,

We are trying to build advanced search functionality that supports partial
search on the actor, producer, director, etc. columns.
So if we denormalize the tables, we would have to create tables such as:
video_by_actor
video_by_producer
video_by_director
video_by_date
etc.
With denormalization, Cassandra only allows equality searches, so to
implement partial search we would need to put Solr on all of the above
tables.

That is my thinking, but I suspect that implementing Apache Solr on every
table is not the correct way.

On Mon, Jun 12, 2017 at 5:11 PM, @Nandan@ <nandanpriyadarshi...@gmail.com>
wrote:

> Hi Edurado,
>
> As you mentioned queries 1-6 ,
> In this condition, we have to proceed with a table like as below :-
> create table videos (
> videoid uuid primary key,
> title text,
> actor text,
> producer text,
> release_date timestamp,
> description text,
> music text,
> etc...
> );
> This table will help to store video datas based on PK videoid and will
> give uniqeness due to uuid.
> But as we know , in one movie there are multiple actor, multiple producer,
> multiple music worked, So how can we store all these.. Only one option will
> left as to use collection type columns.
>
>
> On Mon, Jun 12, 2017 at 4:59 PM, Eduardo Alonso <eduardoalo...@stratio.com
> > wrote:
>
>> TLDR shouldBe *PD
>>
>> Eduardo Alonso
>> Vía de las dos Castillas, 33, Ática 4, 3ª Planta
>> 28224 Pozuelo de Alarcón, Madrid
>> Tel: +34 91 828 6473 <+34%20918%2028%2064%2073> // www.stratio.com // 
>> *@stratiobd
>> <https://twitter.com/StratioBD>*
>>
>> 2017-06-12 10:58 GMT+02:00 Eduardo Alonso <eduardoalo...@stratio.com>:
>>
>>> Hi Nandan:
>>>
>>> So, your system must provide these queries:
>>>
>>> 1 - SELECT video FROM ... WHERE actor = '...';
>>> 2 - SELECT video FROM ... WHERE producer = '...';
>>> 3 - SELECT video FROM ... WHERE music = '...';
>>> 4 - SELECT video FROM ... WHERE actor = '...' AND producer ='...';
>>> 5 - SELECT video FROM ... WHERE actor = '...' AND music = '...';
>>> 6 - SELECT video WHERE title CONTAINS 'Harry';
>>>
>>>
>>> For queries 1-5 you can get them with just cassandra, denormalizing
>>> tables just the way your mentioned but without solr, just cassandra
>>> (Indeed, just for equality clauses)
>>>
>>> video_by_actor;
>>> video_by_producer;
>>> video_by_music;
>>> video_by_actor_and_producer;
>>> video_by_actor_and_music;
>>>
>>> For queries number 6 you need a search engine.
>>>
>>> Solr
>>> ElasticSearch
>>> cassandra-lucene-index
>>> <https://github.com/stratio/cassandra-lucene-index>
>>> SASI
>>> <http://docs.datastax.com/en/dse/5.1/cql/cql/cql_reference/cql_commands/cqlCreateCustomIndex.html>
>>>
>>> I think, just for your query,  the easiest way to get it is to build a
>>> SASI index.
>>> TLDR: I work for stratio in cassandra-lucene-index but for your basic
>>> query (only one dimension), SASI indexes will work for you.
>>>
>>>
>>>
>>>
>>> Eduardo Alonso
>>> Vía de las dos Castillas, 33, Ática 4, 3ª Planta
>>> 28224 Pozuelo de Alarcón, Madrid
>>> Tel: +34 91 828 6473 <+34%20918%2028%2064%2073> // www.stratio.com // 
>>> *@stratiobd
>>> <https://twitter.com/StratioBD>*
>>>
>>> 2017-06-12 9:50 GMT+02:00 @Nandan@ <nandanpriyadarshi...@gmail.com>:
>>>
>>>> But Condition is , I am working with Apache Cassandra Database in which
>>>> I have to store my data into Cassandra and then have to implement partial
>>>> search capability.
>>>> If we need to search based on full search  primary key, then it really
>>>> best and easy to work with Cassandra , but in case of flexible search , I
>>>> am getting confused.
>>>>
>>>>
>>>> On Mon, Jun 12, 2017 at 3:47 PM, Oskar Kjellin <oskar.kjel...@gmail.com
>>>> > wrote:
>>>>
>>>>> I haven't run solr with Cassandra myself. I just meant to run
>>>>> elasticsearch as a completely separate service and write there as well.
>>>>>
>>>>> On 12 Jun 2017, at 09:45, @Nandan@ <nandanpriyadarshi...@gmail.com>
>>>>> wrote:
>>>>>
>>>>> Do you mean to use Elastic Search with Cassandra?
>>>>> Even I am thinking to use Apache Solr With Cassandra.

Re: Reg:- Cassandra Data modelling for Search

2017-06-12 Thread @Nandan@
Hi Eduardo,

For the queries 1-6 you mentioned, we would proceed with a table like this:
create table videos (
videoid uuid primary key,
title text,
actor text,
producer text,
release_date timestamp,
description text,
music text,
etc...
);
This table stores video data keyed by videoid, and the uuid gives
uniqueness.
But as we know, one movie has multiple actors, producers, and music
composers, so how can we store all of these? The only option left seems to
be collection-type columns.


On Mon, Jun 12, 2017 at 4:59 PM, Eduardo Alonso <eduardoalo...@stratio.com>
wrote:

> TLDR shouldBe *PD
>
> Eduardo Alonso
> Vía de las dos Castillas, 33, Ática 4, 3ª Planta
> 28224 Pozuelo de Alarcón, Madrid
> Tel: +34 91 828 6473 <+34%20918%2028%2064%2073> // www.stratio.com // 
> *@stratiobd
> <https://twitter.com/StratioBD>*
>
> 2017-06-12 10:58 GMT+02:00 Eduardo Alonso <eduardoalo...@stratio.com>:
>
>> Hi Nandan:
>>
>> So, your system must provide these queries:
>>
>> 1 - SELECT video FROM ... WHERE actor = '...';
>> 2 - SELECT video FROM ... WHERE producer = '...';
>> 3 - SELECT video FROM ... WHERE music = '...';
>> 4 - SELECT video FROM ... WHERE actor = '...' AND producer ='...';
>> 5 - SELECT video FROM ... WHERE actor = '...' AND music = '...';
>> 6 - SELECT video WHERE title CONTAINS 'Harry';
>>
>>
>> For queries 1-5 you can get them with just cassandra, denormalizing
>> tables just the way your mentioned but without solr, just cassandra
>> (Indeed, just for equality clauses)
>>
>> video_by_actor;
>> video_by_producer;
>> video_by_music;
>> video_by_actor_and_producer;
>> video_by_actor_and_music;
>>
>> For queries number 6 you need a search engine.
>>
>> Solr
>> ElasticSearch
>> cassandra-lucene-index
>> <https://github.com/stratio/cassandra-lucene-index>
>> SASI
>> <http://docs.datastax.com/en/dse/5.1/cql/cql/cql_reference/cql_commands/cqlCreateCustomIndex.html>
>>
>> I think, just for your query,  the easiest way to get it is to build a
>> SASI index.
>> TLDR: I work for stratio in cassandra-lucene-index but for your basic
>> query (only one dimension), SASI indexes will work for you.
>>
>>
>>
>>
>> Eduardo Alonso
>> Vía de las dos Castillas, 33, Ática 4, 3ª Planta
>> 28224 Pozuelo de Alarcón, Madrid
>> Tel: +34 91 828 6473 <+34%20918%2028%2064%2073> // www.stratio.com // 
>> *@stratiobd
>> <https://twitter.com/StratioBD>*
>>
>> 2017-06-12 9:50 GMT+02:00 @Nandan@ <nandanpriyadarshi...@gmail.com>:
>>
>>> But Condition is , I am working with Apache Cassandra Database in which
>>> I have to store my data into Cassandra and then have to implement partial
>>> search capability.
>>> If we need to search based on full search  primary key, then it really
>>> best and easy to work with Cassandra , but in case of flexible search , I
>>> am getting confused.
>>>
>>>
>>> On Mon, Jun 12, 2017 at 3:47 PM, Oskar Kjellin <oskar.kjel...@gmail.com>
>>> wrote:
>>>
>>>> I haven't run solr with Cassandra myself. I just meant to run
>>>> elasticsearch as a completely separate service and write there as well.
>>>>
>>>> On 12 Jun 2017, at 09:45, @Nandan@ <nandanpriyadarshi...@gmail.com>
>>>> wrote:
>>>>
>>>> Do you mean to use Elastic Search with Cassandra?
>>>> Even I am thinking to use Apache Solr With Cassandra.
>>>> In that case I have to create distributed tables such as:-
>>>> 1) video_by_title, video_by_actor, video_by_year  etc..
>>>> 2) After creating Tables , will have to configure solr core on all
>>>> tables.
>>>>
>>>> Is it like this ?
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Mon, Jun 12, 2017 at 3:19 PM, Oskar Kjellin <oskar.kjel...@gmail.com
>>>> > wrote:
>>>>
>>>>> Why not elasticsearch for this use case? It will make your life much
>>>>> simpler
>>>>>
>>>>> > On 12 Jun 2017, at 04:40, @Nandan@ <nandanpriyadarshi...@gmail.com>
>>>>> wrote:
>>>>> >
>>>>> > Hi,
>>>>> >
>>>>> > Currently, I am working on data modeling for Video Company in which
>>>>> we have different types of users as well as different user functionality.
>>>>> > But curr

Re: Reg:- Cassandra Data modelling for Search

2017-06-12 Thread @Nandan@
But the condition is, I am working with the Apache Cassandra database, in
which I have to store my data and then implement partial search capability.
If we need to search based on the full primary key, then it is really easy
and works best with Cassandra, but in the case of flexible search, I am
getting confused.


On Mon, Jun 12, 2017 at 3:47 PM, Oskar Kjellin <oskar.kjel...@gmail.com>
wrote:

> I haven't run solr with Cassandra myself. I just meant to run
> elasticsearch as a completely separate service and write there as well.
>
> On 12 Jun 2017, at 09:45, @Nandan@ <nandanpriyadarshi...@gmail.com> wrote:
>
> Do you mean to use Elastic Search with Cassandra?
> Even I am thinking to use Apache Solr With Cassandra.
> In that case I have to create distributed tables such as:-
> 1) video_by_title, video_by_actor, video_by_year  etc..
> 2) After creating Tables , will have to configure solr core on all tables.
>
> Is it like this ?
>
>
>
>
>
> On Mon, Jun 12, 2017 at 3:19 PM, Oskar Kjellin <oskar.kjel...@gmail.com>
> wrote:
>
>> Why not elasticsearch for this use case? It will make your life much
>> simpler
>>
>> > On 12 Jun 2017, at 04:40, @Nandan@ <nandanpriyadarshi...@gmail.com>
>> wrote:
>> >
>> > Hi,
>> >
>> > Currently, I am working on data modeling for Video Company in which we
>> have different types of users as well as different user functionality.
>> > But currently, my concern is about Search video module based on
>> different fields.
>> >
>> > Query patterns are as below:-
>> > 1) Select video by actor.
>> > 2) select video by producer.
>> > 3) select video by music.
>> > 4) select video by actor and producer.
>> > 5) select video by actor and music.
>> >
>> > Note: - In short, We want to establish an advanced search module by
>> which we can search by anyway and get the desired results.
>> >
>> > During a search , we need partial search also such that if any user can
>> search "Harry" title, then we are able to give them result as all videos
>> whose
>> >  title contains "Harry" at any location.
>> >
>> > As per my ideas, I have to create separate tables such as
>> video_by_actor, video_by_producer etc.. and implement solr query on all
>> tables. Otherwise,
>> > is there any others way by which we can implement this search module
>> effectively.
>> >
>> > Please suggest.
>> >
>> > Best regards,
>>
>
>


Re: Reg:- Cassandra Data modelling for Search

2017-06-12 Thread @Nandan@
Do you mean to use Elasticsearch with Cassandra?
Even I am thinking to use Apache Solr with Cassandra.
In that case I would have to create distributed tables such as:-
1) video_by_title, video_by_actor, video_by_year  etc..
2) After creating the tables, I will have to configure a Solr core on all tables.

Is it like this ?





On Mon, Jun 12, 2017 at 3:19 PM, Oskar Kjellin <oskar.kjel...@gmail.com>
wrote:

> Why not elasticsearch for this use case? It will make your life much
> simpler
>
> > On 12 Jun 2017, at 04:40, @Nandan@ <nandanpriyadarshi...@gmail.com>
> wrote:
> >
> > Hi,
> >
> > Currently, I am working on data modeling for Video Company in which we
> have different types of users as well as different user functionality.
> > But currently, my concern is about Search video module based on
> different fields.
> >
> > Query patterns are as below:-
> > 1) Select video by actor.
> > 2) select video by producer.
> > 3) select video by music.
> > 4) select video by actor and producer.
> > 5) select video by actor and music.
> >
> > Note: - In short, We want to establish an advanced search module by
> which we can search by anyway and get the desired results.
> >
> > During a search , we need partial search also such that if any user can
> search "Harry" title, then we are able to give them result as all videos
> whose
> >  title contains "Harry" at any location.
> >
> > As per my ideas, I have to create separate tables such as
> video_by_actor, video_by_producer etc.. and implement solr query on all
> tables. Otherwise,
> > is there any others way by which we can implement this search module
> effectively.
> >
> > Please suggest.
> >
> > Best regards,
>


Reg:- Cassandra Data modelling for Search

2017-06-11 Thread @Nandan@
Hi,

Currently, I am working on data modeling for a video company in which we have
different types of users as well as different user functionality.
My concern right now is the video search module, which must search by
different fields.

Query patterns are as below:-
1) Select video by actor.
2) select video by producer.
3) select video by music.
4) select video by actor and producer.
5) select video by actor and music.
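For the five equality queries above, the denormalized query-table approach can be sketched in CQL; the table below is a hypothetical example (column names assumed) that serves both query 1 and query 4 from a single table, because a clustering column may simply be omitted from the WHERE clause:

```sql
-- One partition per actor; producer as a clustering column lets the same
-- table answer "by actor" (query 1) and "by actor AND producer" (query 4).
CREATE TABLE video_by_actor_and_producer (
    actor text,
    producer text,
    video_id uuid,
    title text,
    PRIMARY KEY ((actor), producer, video_id)
);

-- Query 1: all videos for an actor.
SELECT video_id, title FROM video_by_actor_and_producer
WHERE actor = 'Some Actor';

-- Query 4: videos for an actor and a specific producer.
SELECT video_id, title FROM video_by_actor_and_producer
WHERE actor = 'Some Actor' AND producer = 'Some Producer';
```

An analogous video_by_actor_and_music table would cover queries 3 and 5, and video_by_producer and video_by_music the remaining single-field lookups.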

Note: - In short, we want to build an advanced search module by which
we can search in any way and get the desired results.

During a search, we also need partial search, such that if any user
searches the title "Harry", we are able to give them as a result all
videos whose title contains "Harry" at any position.

As per my ideas, I have to create separate tables such as video_by_actor,
video_by_producer etc. and implement Solr queries on all tables. Otherwise,
is there any other way by which we can implement this search module
effectively?

Please suggest.

Best regards,
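For the partial-match requirement ("Harry" anywhere in the title), a SASI index is one plain-Cassandra option (available from Cassandra 3.4); a minimal sketch, assuming a videos table with a text title column as defined elsewhere in this archive (the index name is made up):

```sql
-- SASI index in CONTAINS mode allows LIKE '%...%' queries on title.
CREATE CUSTOM INDEX videos_title_sasi ON videos (title)
USING 'org.apache.cassandra.index.sasi.SASIIndex'
WITH OPTIONS = {
    'mode': 'CONTAINS',
    'analyzer_class': 'org.apache.cassandra.index.sasi.analyzer.NonTokenizingAnalyzer',
    'case_sensitive': 'false'
};

-- Matches "Harry" at any position in the title.
SELECT videoid, title FROM videos WHERE title LIKE '%Harry%';
```

CONTAINS-mode SASI indexes can grow large and add write overhead, so they should be benchmarked before production use.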


Re: Reg:- Data Modelling For Hierarchy Data

2017-06-09 Thread @Nandan@
MVs are a good option, but consider replication: suppose we are using RF=3,
then each view stores another three copies of the data on top of the base
table's three, which may be unwanted if we insert a lot of users.
But I think we can go with MVs also.


On Fri, Jun 9, 2017 at 4:41 PM, Jacques-Henri Berthemet <
jacques-henri.berthe...@genesys.com> wrote:

> For query 2) you should have a second table, secondary index is usually
> never recommended. If you’re planning to use Cassandra 3.x you should take
> a look at materialized views (MVs):
>
> http://cassandra.apache.org/doc/latest/cql/mvs.html
>
> https://opencredo.com/everything-need-know-cassandra-materialized-views/
>
>
>
> I don’t have experience on MVs, I’m stuck on 2.2 for now.
>
>
>
> Regards,
>
> *--*
>
> *Jacques-Henri Berthemet*
>
>
>
> *From:* @Nandan@ [mailto:nandanpriyadarshi...@gmail.com]
> *Sent:* vendredi 9 juin 2017 10:27
> *To:* Jacques-Henri Berthemet <jacques-henri.berthe...@genesys.com>
> *Cc:* user@cassandra.apache.org
> *Subject:* Re: Reg:- Data Modelling For Hierarchy Data
>
>
>
> Hi,
>
> Yes, I am following with single Users table.
>
> Suppose my query patterns are:-
>
> 1) Select user by email.
>
> 2) Select user by user_type
>
> 1st query pattern will satisfy the Users table, but in the case of second
> query pattern, either have to go with another table like user_by_type or I
> have to create secondary index on user_type by which client will able to
> access Only Buyer or Seller Records.
>
>
>
> Please suggest the best way.
>
> Best Regards.
>
> Nandan
>
>
>
> On Fri, Jun 9, 2017 at 3:59 PM, Jacques-Henri Berthemet <
> jacques-henri.berthe...@genesys.com> wrote:
>
> Hi,
>
>
>
> According to your model a user can only be of one type, so I’d go with a
> very simple model with a single table:
>
>
>
> string email (PK), string user_type, map<string, string> attributes
>
>
>
> user_type can be Buyer, Master_Seller, Slave_Seller and all other columns
> go into the attributes map as long as all of them don’t exceed 64k, but you
> could create dedicated columns for all attributes that you know will always
> be there.
>
>
>
> *--*
>
> *Jacques-Henri Berthemet*
>
>
>
> *From:* @Nandan@ [mailto:nandanpriyadarshi...@gmail.com]
> *Sent:* vendredi 9 juin 2017 03:14
> *To:* user@cassandra.apache.org
> *Subject:* Reg:- Data Modelling For Hierarchy Data
>
>
>
> Hi,
>
>
>
> I am working on Music database where we have multiple order of users of
> our portal. Different category of users is having some common attributes
> but some different attributes based on their registration.
>
> This becomes a hierarchy pattern. I am attaching one sample hierarchy
> pattern of User Module which is somehow part of my current data modeling.
>
>
>
> *There are few conditions:-*
>
> *1) email id should be unique. i.e If some user registered with one email
> id then that particular user can't able to register as another user. *
>
> *2) Some type of users having 20-30 columns as in their registration. such
> as company,address,email,first_name,join_date etc..*
>
>
>
> *Query pattern is like:-*
>
> *1) select user by email*
>
>
>
> Please suggest me how to do data modeling for these type of
> hierarchy data.
>
> Should I create a seperate table for the seperate type of users or should
> I go with single user table?
>
> As we have unique email id condition, so should I go with email id as a
> primary key or user_id UUID will be the best choice.
>
>
>
>
>
>
>
> Best regards,
>
> Nandan Priyadarshi
>
>
>


Re: Reg:- Data Modelling For Hierarchy Data

2017-06-09 Thread @Nandan@
Hi,
Yes, I am following with single Users table.
Suppose my query patterns are:-
1) Select user by email.
2) Select user by user_type
The 1st query pattern is satisfied by the Users table, but for the second
query pattern I either have to go with another table like user_by_type, or I
have to create a secondary index on user_type by which the client will be
able to access only Buyer or Seller records.

Please suggest the best way.
Best Regards.
Nandan

On Fri, Jun 9, 2017 at 3:59 PM, Jacques-Henri Berthemet <
jacques-henri.berthe...@genesys.com> wrote:

> Hi,
>
>
>
> According to your model a user can only be of one type, so I’d go with a
> very simple model with a single table:
>
>
>
> string email (PK), string user_type, map<string, string> attributes
>
>
>
> user_type can be Buyer, Master_Seller, Slave_Seller and all other columns
> go into the attributes map as long as all of them don’t exceed 64k, but you
> could create dedicated columns for all attributes that you know will always
> be there.
>
>
>
> *--*
>
> *Jacques-Henri Berthemet*
>
>
>
> *From:* @Nandan@ [mailto:nandanpriyadarshi...@gmail.com]
> *Sent:* vendredi 9 juin 2017 03:14
> *To:* user@cassandra.apache.org
> *Subject:* Reg:- Data Modelling For Hierarchy Data
>
>
>
> Hi,
>
>
>
> I am working on Music database where we have multiple order of users of
> our portal. Different category of users is having some common attributes
> but some different attributes based on their registration.
>
> This becomes a hierarchy pattern. I am attaching one sample hierarchy
> pattern of User Module which is somehow part of my current data modeling.
>
>
>
> *There are few conditions:-*
>
> *1) email id should be unique. i.e If some user registered with one email
> id then that particular user can't able to register as another user. *
>
> *2) Some type of users having 20-30 columns as in their registration. such
> as company,address,email,first_name,join_date etc..*
>
>
>
> *Query pattern is like:-*
>
> *1) select user by email*
>
>
>
> Please suggest me how to do data modeling for these type of
> hierarchy data.
>
> Should I create a seperate table for the seperate type of users or should
> I go with single user table?
>
> As we have unique email id condition, so should I go with email id as a
> primary key or user_id UUID will be the best choice.
>
>
>
>
>
>
>
> Best regards,
>
> Nandan Priyadarshi
>


Reg:- Data Modelling For Hierarchy Data

2017-06-08 Thread @Nandan@
Hi,

I am working on a music database where we have multiple tiers of users on our
portal. Different categories of users have some common attributes but also
some different attributes based on their registration.
This becomes a hierarchy pattern. I am attaching one sample hierarchy
pattern of User Module which is somehow part of my current data modeling.

*There are few conditions:-*
*1) The email id should be unique, i.e. if some user registered with one email
id, then that particular user can't register again as another user.*
*2) Some types of users have 20-30 columns in their registration, such
as company, address, email, first_name, join_date etc.*

*Query pattern is like:-*
*1) select user by email*

Please suggest how to do data modeling for this type of hierarchical data.
Should I create a separate table for each type of user, or should I go with a
single user table?
As we have the unique email id condition, should I go with email id as the
primary key, or would a user_id UUID be the best choice?



Best regards,
Nandan Priyadarshi
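The single-table model suggested in the replies (common columns plus a map for type-specific attributes) could look like this, with a lightweight transaction enforcing the unique-email condition; all names and values here are illustrative:

```sql
CREATE TABLE users (
    email text PRIMARY KEY,
    user_type text,               -- 'Buyer', 'Master_Seller', 'Slave_Seller', ...
    first_name text,
    join_date timestamp,
    attributes map<text, text>    -- type-specific fields that vary per user_type
);

-- IF NOT EXISTS makes the insert a lightweight transaction, so a second
-- registration with the same email is rejected instead of silently upserted.
INSERT INTO users (email, user_type, first_name, attributes)
VALUES ('a@example.com', 'Buyer', 'A', {'company': 'Acme'})
IF NOT EXISTS;
```

The trade-off is that LWTs use Paxos and cost several round trips, so they belong on the registration path only, not on every read or update.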

-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org

Reg:- Multi DC Configuration

2017-06-06 Thread @Nandan@
Hi ,

I am trying to Setup Cassandra 3.9 on Multi DC.
Currently, I am having 2 DCs with 3 and 2 nodes respectively.

DC1 Name :- India
Nodes :- 192.16.0.1 , 192.16.0.2, 192.16.0.3
DC2 Name :- USA
Nodes :- 172.16.0.1 , 172.16.0.2

Please help me to know which files I need to change to configure
multi-DC successfully.

I am using Ubuntu 16.04 Operating System.

Thanks and Best Regards,
Nandan Priyadarshi
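As a sketch (values assumed from the node lists above), the usual multi-DC setup touches cassandra.yaml and cassandra-rackdc.properties on every node, plus the keyspace replication settings:

```
# cassandra.yaml (every node): same cluster_name everywhere, a DC-aware
# snitch, and at least one seed from each DC.
cluster_name: 'MyCluster'
endpoint_snitch: GossipingPropertyFileSnitch
seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          - seeds: "192.16.0.1,172.16.0.1"

# cassandra-rackdc.properties on the India nodes:
dc=India
rack=rack1

# cassandra-rackdc.properties on the USA nodes:
dc=USA
rack=rack1
```

Keyspaces then need NetworkTopologyStrategy, e.g. replication = {'class': 'NetworkTopologyStrategy', 'India': 3, 'USA': 2}, and each node must be restarted after the snitch change.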


Reg:- Generate dummy data for Cassandra Tables

2017-06-04 Thread @Nandan@
Hi All,

I am creating a personal Cassandra project in which I want to
insert some random dummy data into my tables.
Please tell me how to do this in a distributed fashion.
Below is just an example :-
Table1 :- videos
create table videos(videoid uuid PRIMARY KEY,title text,tag text,added_time
timestamp,year int);
Table2 :- videos_by_actor
create table videos_by_actor(actor text,uploaded_time timestamp,videoid
uuid,userid uuid,title text,primary key (actor,uploaded_time,videoid));
Table3:- users
create table users(userid uuid,first_name text,last_name text,email
text,join_year int,primary key(userid));
Table4:- user_login
create table user_login(email text primary key,password text,userid uuid);

Please help me to know how to generate dummy data and insert it into all
tables in a way that satisfies the distributed nature.
Thanks.
Nandan
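A minimal sketch of one way to do this in Python (the table and column names are taken from the videos table above; the sample values are made up): generate random rows and emit plain CQL INSERT statements, relying on the random UUID partition key to spread rows across the cluster:

```python
import random
import uuid
from datetime import datetime, timedelta

# Hypothetical sample values for the dummy rows.
TITLES = ["Harry Potter", "Inception", "Casablanca", "Alien"]
TAGS = ["drama", "action", "comedy", "sci-fi"]

def dummy_video_insert():
    """Build one CQL INSERT for the videos table with random values.

    The random UUID partition key hashes to a random token, which is what
    distributes the rows evenly across the nodes of the cluster.
    """
    videoid = uuid.uuid4()
    title = random.choice(TITLES)
    tag = random.choice(TAGS)
    added = datetime(2017, 1, 1) + timedelta(days=random.randint(0, 364))
    year = random.randint(1990, 2017)
    return ("INSERT INTO videos (videoid, title, tag, added_time, year) "
            "VALUES (%s, '%s', '%s', '%s', %d);"
            % (videoid, title, tag, added.strftime("%Y-%m-%d %H:%M:%S"), year))

if __name__ == "__main__":
    for _ in range(10):
        print(dummy_video_insert())
```

The printed statements can be piped into cqlsh or executed through a driver, and the same pattern extends to the other three tables.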


Reg:- Data Modelling Documentation

2017-05-19 Thread @Nandan@
Hi Team,
Just for information: when will the data modeling document be published on the
official site?
I have been waiting for it a long time.

Please update the documentation; currently no documents are present. And please
publish a document that can be understood by the absolute beginner as well as
others.

Thanks in Advance.
Best Regards,
Nandan Priyadarshi


Re: Reg:- Data Modelling Concepts

2017-05-16 Thread @Nandan@
Hi Jon,

We need to keep track of all updates, so that a 'User' of our platform can
check what changes were made before.
I am thinking in this way:
CREATE TABLE book_info (
    book_id uuid,
    book_title text,
    author_name text,
    updated_at timestamp,
    PRIMARY KEY (book_id));
This table will contain the details of all books with their latest updated values.
CREATE TABLE book_title_by_user (
    book_title text,
    book_id uuid,
    user_id uuid,
    ts timeuuid,
    PRIMARY KEY (book_title, book_id, user_id, ts));
This table will contain details of the multiple old updates of a book, which
can be made by multiple users, like MANY TO MANY.

What do you think on this?

On Wed, May 17, 2017 at 9:44 AM, Jonathan Haddad <j...@jonhaddad.com> wrote:

> I don't understand why you need to store the old value a second time.  If
> you know that the value went from A -> B -> C, just store the new value,
> not the old.  You can see that it changed from A->B->C without storing it
> twice.
>
> On Tue, May 16, 2017 at 6:36 PM @Nandan@ <nandanpriyadarshi...@gmail.com>
> wrote:
>
>> The requirement is to create DB in which we have to keep data of Updated
>> values as well as which user update the particular book details and what
>> they update.
>>
>> We are like to create a schema which store book info, as well as the
>> history of the update, made based on book_title, author, publisher, price
>> changed.
>> Like we want to store what was old data and what new data updated.. and
>> also want to check which user updated the relevant change. Because suppose
>> if some changes not made correctly then they can check changes and revert
>> based on old values.
>> We are trying to make a USER based Schema.
>>
>> For example:-
>> id:- 1
>> Name: - Harry Poter
>> Author : - JK Rolling
>>
>> New Update Done by user_id 2:-
>> id :- 1
>> Name:- Harry Pottor
>> Author:- J.K. Rolls
>>
>> Update history also need to store as :-
>> User_id :- 2
>> Old Author :- JK Rolling
>> New Author :- J.K. Rolls
>>
>> So I need to update the details of Book which is done by UPSERT. But also
>> I have to keep details like which user updated and what updated.
>>
>>
>> One thing that helps define the schema is knowing what queries will be
>> made to the database up front.
>> Few queries that the database needs to answer.
>> What are the current details of a book?
>> What is the most recent update to a particular book?
>> What are the updates that have been made to a particular book?
>> What are the details for a particular update?
>>
>>
>> Update frequently will be like Update will happen based on Title, name,
>> Author, price , publisher like. So not very high frequently.
>>
>> Best Regards,
>> Nandan
>>
>


Reg:- Data Modelling Concepts

2017-05-16 Thread @Nandan@
The requirement is to create DB in which we have to keep data of Updated
values as well as which user update the particular book details and what
they update.

We would like to create a schema which stores book info, as well as the
history of updates made, based on book_title, author, publisher and price
changes.
We want to store what the old data was and what new data was updated, and
also to check which user made the relevant change. Because, suppose some
changes were not made correctly, then they can check the changes and revert
based on the old values.
We are trying to make a USER-based schema.

For example:-
id:- 1
Name: - Harry Poter
Author : - JK Rolling

New Update Done by user_id 2:-
id :- 1
Name:- Harry Pottor
Author:- J.K. Rolls

Update history also need to store as :-
User_id :- 2
Old Author :- JK Rolling
New Author :- J.K. Rolls

So I need to update the details of a book, which is done by UPSERT. But I
also have to keep details like which user updated it and what was updated.


One thing that helps define the schema is knowing up front what queries will
be made to the database.
A few queries that the database needs to answer:
What are the current details of a book?
What is the most recent update to a particular book?
What are the updates that have been made to a particular book?
What are the details for a particular update?


Update frequency: updates will happen based on fields like title, name,
author, price and publisher, so not very frequently.

Best Regards,
Nandan
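One possible audit-table sketch for the queries listed above (the column names are assumptions): current details live in a table keyed by book_id, and every field change appends a row here, newest first:

```sql
-- One partition per book; timeuuid clustering orders updates by time and
-- keeps concurrent updates from colliding.
CREATE TABLE book_update_history (
    book_id uuid,
    ts timeuuid,
    field_name text,
    user_id uuid,
    old_value text,
    new_value text,
    PRIMARY KEY ((book_id), ts, field_name)
) WITH CLUSTERING ORDER BY (ts DESC, field_name ASC);

-- All updates made to a particular book, newest first.
SELECT * FROM book_update_history WHERE book_id = ?;

-- The most recent update to a particular book.
SELECT * FROM book_update_history WHERE book_id = ? LIMIT 1;
```

Since history rows are append-only, reverting a bad change is just reading the old_value for the affected fields and issuing a fresh UPDATE against the current-details table.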


Reg:- DSE 5.1.0 Issue

2017-05-16 Thread @Nandan@
Hi ,
Sorry in advance if this is the wrong list for this post.

I am stuck at a particular step.

I was using DSE 4.8 on Single DC with 3 nodes. Today I upgraded my all 3
nodes to DSE 5.1
The issue is, when I try to run 'service dse restart' I get the error
message:

Hadoop functionality has been removed from DSE.
Please try again without the HADOOP_ENABLED set in /etc/default/dse.

Even in the /etc/default/dse file, HADOOP_ENABLED is set to 0.

For testing, once I changed HADOOP_ENABLED = 1,

I get the error:

Found multiple DSE core jar files in /usr/share/dse/lib
/usr/share/dse/resources/dse/lib /usr/share/dse /usr/share/dse/common .
Please make sure there is only one.

I have searched many articles, but so far I have not been able to find the
solution. Please help me get out of this mess.

Thanks and Best Regards,
Nandan Priyadarshi.


Reg:- Data Modelling based on Update History details

2017-05-15 Thread @Nandan@
Hi,
I am currently working on a Book Management System in which I have a table
which contains book details, where the PRIMARY KEY is book_id uuid.
The requirement is to create a DB in which we keep the updated values as
well as which user updated the particular book details and what they
updated.

For example:-
id:- 1
Name: - Harry Poter
Author : - JK Rolling

New Update Done by user_id 2:-
id :- 1
Name:- Harry Pottor
Author:- J.K. Rolls

So I need to update the details of a book, which is done by UPSERT. But I
also have to keep details like which user updated it and what was updated.

I hope I was able to describe my scenario in detail. Please advise on the
above scenario.


Reg:- CQL SOLR Query Not gives result

2017-05-11 Thread @Nandan@
Hi ,

In my table I have a few records and have implemented Solr for partial
search, but I am not able to retrieve data.

SELECT * from revall_book_by_title where solr_query = 'language:中';
SELECT * from revall_book_by_title where solr_query = 'language:中*';

Neither of them works.
Any suggestions?


Reg:- Apache Solr with DSE Query

2017-05-11 Thread @Nandan@
Hi,

Details are as below:-
1) Have table:- video_info
2) PRIMARY KEY:- video_id UUID
3) having records around 5.
4) Table is having around 30-35 columns
5) Using DSE 4.8

Need clarifications and suggestions:-
1) I need to search by a few (3-4) columns like video_title, video_actor etc.
2) If I implement Solr indexing on this single table, then we will be able to
query by other columns and much more, but is it going to affect my READ and
WRITE speed?
3) Will it be a good idea to implement Solr directly, or not?

Please suggest on the above.
Thanks.
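One way to limit the Solr impact is to reserve the Solr core for fuzzy and partial searches and serve the exact-match lookups from plain denormalized query tables; a hypothetical sketch (column names assumed):

```sql
-- Exact title lookups never touch Solr, so they keep plain Cassandra
-- read latency; only partial/fuzzy searches go through the Solr core.
CREATE TABLE video_by_title (
    video_title text,
    video_id uuid,
    video_actor text,
    PRIMARY KEY ((video_title), video_id)
);

SELECT video_id, video_actor FROM video_by_title
WHERE video_title = 'Harry Potter';
```

The write path then inserts into both the main table and each query table (a logged batch can keep them consistent), which trades extra write work for predictable reads.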