Hi, Replicated vs partititoned is desicion that depends on many factors: data set size, growth rate, reliability, colocation requirements.
I don't know any details about your domain. So I can only make some assumptions. 1. "good" and "good_type" tables are dictionary tables. Usually tables like this are not big. So we can represent they as replicated caches. It means that all rows will be replicated on each cluster node. It leads to usefull effect: this tables don't require any colocation. 2. "day_report" table isn't big also. If I understand correctly this table contains one data row per day. So this table can b erepresented as replicated cache. 3. "user", "user_order" and "order_good" tables can have a lot of data and growth rate can be big enough. It makes sense to represent this tables as partitioned cache. 4. If you have partitioned caches that involved into crosscache queries then you should colocate data from this caches in order to provide correct joining and good performance. Thus data about user orders should be colocated with users (i.e. all user order entries should be on the same nodes where this user entries are placed) and data about goods in orders should be colocated with orders (i.e. all "order_good" rows for specific order should be placed on the same nodes where corresponding order entry is placed). About your questions: 1. Some tables do not have primary key, should I need to add primary key on > them first before importing them to Ignite? Yes, you should. Any entry in cache must have unique key. 2. Which tables should be setting as Replicated cache mode? And which table > should be setting as partition cache mode? See my analysis above. 3. Should I need to set any AffinityKey? Yes, you should. For example, in order to colocate user orders with users you should use AffinityKey for "user_order" table: AffinityKey<Integer>( user_order.id, user_order.usr_id). 4. Some join connections are not connected to primary key(such as > day_report.report_date = user_order.create_time where create_time is not a > primary key), Does it have any affect? > Primary key is a just unique not null key. Join can be executed on any field, but it would be better if this field is indexed. Probably in your case make sense to truncate user_order.create_time field to date type (without info about hours, minutes, etc). On Sat, Nov 7, 2015 at 5:22 PM, iceguo <[email protected]> wrote: > Thank you for your reply, but I am still not very clear about the rules of > setting a table for Replicated cache mode or Partition cache mode. > I have another a little more complicated example, could you help to > analysis > it? > > In this example, I have 6 tables > > /*Table structure for table `day_report` */ > CREATE TABLE `day_report` ( > `report_date` date DEFAULT NULL, > `money` decimal(10,0) DEFAULT NULL, > `people` int(11) DEFAULT NULL > ) ENGINE=InnoDB DEFAULT CHARSET=utf8; > > /*Table structure for table `good` */ > > CREATE TABLE `good` ( > `id` int(11) NOT NULL, > `name` varchar(200) DEFAULT NULL, > `type` int(11) DEFAULT NULL, > `money` decimal(10,0) DEFAULT NULL, > PRIMARY KEY (`id`) > ) ENGINE=InnoDB DEFAULT CHARSET=utf8; > > /*Table structure for table `good_type` */ > > CREATE TABLE `good_type` ( > `id` int(11) NOT NULL, > `name` varchar(200) DEFAULT NULL, > `pid` int(11) DEFAULT NULL, > PRIMARY KEY (`id`) > ) ENGINE=InnoDB DEFAULT CHARSET=utf8; > > /*Table structure for table `order_good` */ > > CREATE TABLE `order_good` ( > `order_id` int(11) DEFAULT NULL, > `good_id` int(11) DEFAULT NULL > ) ENGINE=InnoDB DEFAULT CHARSET=utf8; > > /*Table structure for table `user` */ > > CREATE TABLE `user` ( > `id` int(11) NOT NULL, > `name` varchar(50) DEFAULT NULL, > PRIMARY KEY (`id`) > ) ENGINE=InnoDB DEFAULT CHARSET=utf8; > > /*Table structure for table `user_order` */ > > CREATE TABLE `user_order` ( > `id` int(11) NOT NULL AUTO_INCREMENT, > `usr_id` int(11) DEFAULT NULL, > `create_time` date DEFAULT NULL, > `money` decimal(10,0) DEFAULT NULL, > KEY `id` (`id`) > ) ENGINE=InnoDB AUTO_INCREMENT=500000 DEFAULT CHARSET=utf8; > > The join connections of them as following: > day_report.report_date = user_order.create_time > user_order.type = good_type.pid > user_order.id = order_good.order_id > good.id = order_good.good_id > > My questions are: > 1. Some tables do not have primary key, should I need to add primary key on > them first before importing them to Ignite? > 2. Which tables should be setting as Replicated cache mode? And which table > should be setting as partition cache mode? > 3. Should I need to set any AffinityKey? > 4. Some join connections are not connected to primary key(such as > day_report.report_date = user_order.create_time where create_time is not a > primary key), Does it have any affect? > > > > -- > View this message in context: > http://apache-ignite-users.70518.x6.nabble.com/The-result-of-cross-cache-SQL-joins-is-incomplete-tp1723p1880.html > Sent from the Apache Ignite Users mailing list archive at Nabble.com. > -- Andrey Gura GridGain Systems, Inc. www.gridgain.com
