Greg,

Have you ever heard of the ACID properties of databases? The "A" in that
abbreviation stands for Atomicity. A single query, even a plain SELECT, is
treated as a simple transaction. "Time" inside a statement is constant:
it is fixed at the moment the statement starts. Your assumption that the
result of the "now()" function could differ within this query is wrong.
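
You can see this for yourself from the mysql client; SLEEP() just forces
a two-second pause between the two calls:

-- NOW() is fixed at statement start: both columns come back identical,
-- even though SLEEP(2) pauses evaluation between them.
SELECT NOW() AS first_call, SLEEP(2) AS pause, NOW() AS second_call;
-- SYSDATE(), by contrast, is evaluated the moment it runs, so here the
-- two values differ by about two seconds.
SELECT SYSDATE() AS first_call, SLEEP(2) AS pause, SYSDATE() AS second_call;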

Another example: suppose we have a table and we want to update every row
with this query:

UPDATE `table` SET field1 = NOW();

This query could take several minutes if the table has millions and
millions of records. Do you think field1 will end up with different
timestamps? It will not: every row receives the same value, because
NOW() is evaluated once for the whole statement.
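
You could verify it on a throwaway table, say a hypothetical `demo` with
a DATETIME column field1:

UPDATE `demo` SET field1 = NOW();
-- every row received the same value, however long the UPDATE ran:
SELECT COUNT(DISTINCT field1) FROM `demo`;  -- returns 1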


Greg:
> Try removing the t.timestamp > DATE_SUB( NOW( ) , INTERVAL 7 DAY ) and
> replacing it with
> 
> t.timestamp > '$precalculated_timestamp'
> otherwise the DB has to perform that calculation on all the rows not
> excluded by the other search arguments, which looks like 65000 in your
> explain.  It can't assume that now() will return the same value for
> the entire life of the query you see.
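> 
> Sketching the idea, with a user variable standing in for the PHP-side
> value:
> 
> SET @cutoff = DATE_SUB(NOW(), INTERVAL 7 DAY);  -- computed once
> -- ...then the WHERE clause compares the column against a constant:
> -- AND t.timestamp > @cutoff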
> 
> Possibly using DATE_FORMAT(t.timestamp, '%format') would be more
> efficient than the TIMESTAMP( CONCAT( YEAR( t.timestamp ) , "-", MONTH
> ( t.timestamp ) , "-", DAYOFMONTH( t.timestamp ) ) ) as well.
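> 
> For a per-day bucket, plain DATE() would give the same grouping with no
> string building at all; sketching it against the query below:
> 
> SELECT COUNT(*) AS count, DATE(t.timestamp) AS division
> FROM listing l, listing_impression t
> WHERE l.id = t.listing_id
>   AND l.company_id = 710
>   AND t.listing_type = 'full'
>   AND t.timestamp > DATE_SUB(NOW(), INTERVAL 7 DAY)
> GROUP BY division
> ORDER BY division;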
> 
> On May 13, 4:36 pm, gregor brabyn <gregorbra...@yahoo.co.uk> wrote:
>> Hi Dmitry
>>
>> EXPLAIN SELECT COUNT( * ) AS count, TIMESTAMP( CONCAT( YEAR( t.timestamp ) , 
>> "-", MONTH( t.timestamp ) , "-", DAYOFMONTH( t.timestamp ) ) ) AS division
>> FROM listing l, listing_impression t
>> WHERE l.id = t.listing_id
>> AND l.company_id =710
>> AND t.listing_type = 'full'
>> AND t.timestamp > DATE_SUB( NOW( ) , INTERVAL 7 DAY )
>> GROUP BY division
>> ORDER BY division;
>>
>> Returns 2 rows:
>>
>> id: 1
>> select_type: SIMPLE
>> table: l
>> type: ref
>> possible_keys: PRIMARY,company_id
>> key: company_id
>> key_len: 8
>> ref: const
>> rows: 1
>> Extra: Using temporary; Using filesort
>>
>> id: 1
>> select_type: SIMPLE
>> table: t
>> type: ref
>> possible_keys: listing_id
>> key: listing_id
>> key_len: 8
>> ref: database_name.l.id
>> rows: 8134
>> Extra: Using where
>>
>> SHOW TABLE STATUS LIKE 'listing_impression' returns:
>>
>> Name: listing_impression
>> Engine: MyISAM
>> Version: 10
>> Row_format: Dynamic
>> Rows: 26329761
>> Avg_row_length: 83
>> Data_length: 1027251968
>> Max_data_length: 281474976710655
>> Index_length: 492332032
>> Data_free: 0
>> Auto_increment: 27531811
>> Create_time: 2009-01-29 16:25:51
>> Update_time: 2009-05-13 15:09:47
>> Check_time: 2009-02-03 00:38:42
>> Collation: latin1_swedish_ci    
>> Checksum: NULL          
>> Create_options:
>> Comment:
>>
>> Since the listing_impression table is being updated all the time, I am
>> worried that adding the new index will result in the database seizing up
>> with too many connections.
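>> 
>> For reference, this is the index in question (from Dmitry's mail below);
>> as I understand it, building it on MyISAM write-locks the table for the
>> duration, so it would need a quiet period:
>> 
>> CREATE INDEX idx1 ON listing_impression
>>   (`listing_id`, `listing_type`, `timestamp`);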
>>
>> I am making some enquiries, but it looks like just the date would do the
>> trick, so a full timestamp is not required. I am, however, worried about
>> changing the data type of a field in such a large table.
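>> 
>> The change I am contemplating is roughly this; I gather it rewrites and
>> locks the whole 26-million-row table while it runs:
>> 
>> ALTER TABLE listing_impression MODIFY `timestamp` DATE NOT NULL;
>> -- note: a DATE column cannot keep the CURRENT_TIMESTAMP default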
>>
>> Would it make much difference if the query
>>
>> SELECT COUNT(*) AS count ,
>>                 TIMESTAMP(CONCAT(YEAR(t.timestamp), "-", MONTH(t.timestamp), 
>> "-", DAYOFMONTH(t.timestamp))) AS division
>> FROM listing l, listing_view t
>> WHERE l.id=t.listing_id
>>         AND l.company_id=710
>>         AND t.timestamp > DATE_SUB(NOW(), INTERVAL 7 day)
>> GROUP BY division
>> ORDER BY division
>>
>> was changed so that it did not include the current day, since the current
>> day is always having rows added to it?
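>> 
>> That is, replacing the timestamp condition with something along these
>> lines:
>> 
>> -- lower bound anchored to midnight, upper bound excluding today's
>> -- still-growing rows; CURDATE() is today's date at statement start:
>> AND t.timestamp >= DATE_SUB(CURDATE(), INTERVAL 7 DAY)
>> AND t.timestamp < CURDATE()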
>>
>> Cheers
>> Greg
>>
>> --- On Wed, 13/5/09, Dmitry Ruban <dmi...@ruban.biz> wrote:
>>
>>> From: Dmitry Ruban <dmi...@ruban.biz>
>>> Subject: [phpug] Re: [phpug][OT] Large Database Tables
>>> To: nzphpug@googlegroups.com
>>> Date: Wednesday, 13 May, 2009, 2:38 PM
>>> Try adding this index; it should help in the first case (equality
>>> columns first, the range column last):
>>> CREATE INDEX idx1 ON listing_impression
>>>   (`listing_id`, `listing_type`, `timestamp`);
>>> Also, if you could post the results of the following queries here,
>>> that would help to analyse your problem further:
>>> 1)
>>> EXPLAIN SELECT COUNT(*) AS count,
>>>   TIMESTAMP(CONCAT(YEAR(t.timestamp), "-", MONTH(t.timestamp), "-",
>>>   DAYOFMONTH(t.timestamp))) AS division
>>> FROM listing l, listing_impression t
>>> WHERE l.id = t.listing_id
>>>   AND l.company_id = 710
>>>   AND t.listing_type = 'full'
>>>   AND t.timestamp > DATE_SUB(NOW(), INTERVAL 7 DAY)
>>> GROUP BY division
>>> ORDER BY division;
>>> 2)
>>> SHOW TABLE STATUS LIKE 'listing_impression'\G
>>> Is it important to keep the full `timestamp` of whatever you are
>>> tracking? Would YYYY-mm-dd be enough?
>>> gregor brabyn:
>>>> The structure of the tables involved are as follows:
>>>> CREATE TABLE `listing_impression` (
>>>>   `id` bigint(20) NOT NULL auto_increment,
>>>>   `listing_id` bigint(20) NOT NULL,
>>>>   `url` varchar(255) NOT NULL,
>>>>   `timestamp` timestamp NOT NULL default CURRENT_TIMESTAMP,
>>>>   `listing_type` enum('full','free') NOT NULL default 'full',
>>>>   PRIMARY KEY  (`id`),
>>>>   KEY `listing_id` (`listing_id`),
>>>>   KEY `from` (`url`)
>>>> ) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=27527384 ;
>>>> CREATE TABLE `listing` (
>>>>   `id` bigint(20) unsigned NOT NULL auto_increment,
>>>>   `category_id` bigint(20) unsigned NOT NULL default '0',
>>>>   `company_id` bigint(20) unsigned NOT NULL default '0',
>>>>   `region_id` bigint(20) unsigned NOT NULL default '0',
>>>>   `href` varchar(255) NOT NULL,
>>>>   `title` varchar(50) NOT NULL,
>>>>   `details` text NOT NULL,
>>>>   `recommendations` varchar(255) NOT NULL default '',
>>>>   `info` tinytext NOT NULL,
>>>>   `address` varchar(100) NOT NULL default '',
>>>>   `city` varchar(50) NOT NULL default '',
>>>>   `phone` varchar(50) NOT NULL,
>>>>   `secondary_phone` varchar(50) NOT NULL,
>>>>   `cellphone` varchar(50) NOT NULL,
>>>>   `fax` varchar(50) NOT NULL,
>>>>   `website` varchar(50) NOT NULL default '',
>>>>   `email` varchar(100) NOT NULL,
>>>>   `contact_name` varchar(50) NOT NULL default '',
>>>>   `sort` smallint(6) NOT NULL,
>>>>   `map_display` tinyint(4) NOT NULL default '0',
>>>>   `map_address` varchar(128) NOT NULL,
>>>>   `map_geocode` varchar(128) NOT NULL,
>>>>   `enable_vianet` tinyint(4) NOT NULL,
>>>>   `location_id_vianet` int(11) NOT NULL,
>>>>   `creation_date` timestamp NOT NULL default CURRENT_TIMESTAMP,
>>>>   `last_updated` datetime NOT NULL,
>>>>   `last_view_milestone` int(11) NOT NULL default '0',
>>>>   `status` enum('active','disabled','pending','deleted') NOT NULL default 'active',
>>>>   `type` enum('full','free') NOT NULL default 'full',
>>>>   PRIMARY KEY  (`id`),
>>>>   KEY `company_id` (`company_id`),
>>>>   KEY `region_id` (`region_id`),
>>>>   KEY `category_id` (`category_id`),
>>>>   KEY `category_teambuilding_id` (`category_teambuilding_id`),
>>>>   FULLTEXT KEY `details` (`details`),
>>>>   FULLTEXT KEY `all` (`title`,`details`)
>>>> ) ENGINE=MyISAM DEFAULT CHARSET=latin1 PACK_KEYS=0 AUTO_INCREMENT=2669 ;
>>>> The query that slows a web page load down by typically 30 seconds is:
>>>> SELECT COUNT(*) AS count,
>>>>   TIMESTAMP(CONCAT(YEAR(t.timestamp), "-", MONTH(t.timestamp), "-",
>>>>   DAYOFMONTH(t.timestamp))) AS division
>>>> FROM listing l, listing_impression t
>>>> WHERE l.id = t.listing_id
>>>>   AND l.company_id = 710
>>>>   AND t.listing_type = 'full'
>>>>   AND t.timestamp > DATE_SUB(NOW(), INTERVAL 7 DAY)
>>>> GROUP BY division
>>>> ORDER BY division
>>>> However, the query I have made to summarise the listing impressions
>>>> for the day (to do data warehousing) is as follows:
>>>> SELECT COUNT(listing_id) AS count_impressions, listing_id
>>>> FROM listing_impression
>>>> WHERE timestamp LIKE '2008-05-06%'
>>>> GROUP BY listing_id
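>>>> (Would a plain range over the day be faster? I gather LIKE on a
>>>> timestamp converts every row to a string before comparing, e.g.:)
>>>> SELECT COUNT(listing_id) AS count_impressions, listing_id
>>>> FROM listing_impression
>>>> WHERE `timestamp` >= '2008-05-06' AND `timestamp` < '2008-05-07'
>>>> GROUP BY listing_id;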
>>>> The listing table only has about 1800 records, whereas
>>>> listing_impression has over 26,000,000.
>>>> Ideas for speeding it all up are welcome.
>>>> Greg
>>>> --- On Wed, 13/5/09, ctx2002 <ctx2...@gmail.com>
>>> wrote:
>>>>> From: ctx2002 <ctx2...@gmail.com>
>>>>> Subject: [phpug] Re: [OT] Large Database Tables
>>>>> To: "NZ PHP Users Group" <nzphpug@googlegroups.com>
>>>>> Date: Wednesday, 13 May, 2009, 10:52 AM
>>>>> We really need to see your table structure and SQL, so we
>>>>> know what the problem is.
>>>>> On May 12, 10:51 pm, "Gregor Brabyn" <gregorbra...@yahoo.co.uk>
>>>>> wrote:
>>>>>> Anyone got any good advice on dealing with large database tables?
>>>>>> I have a page in the client area of a website that normally takes
>>>>>> over 30 seconds to load, almost purely due to a query on a large
>>>>>> database table. The table records impressions for listings; each
>>>>>> row represents one listing getting an impression. The table now has
>>>>>> over 26,000,000 rows accumulated over a period just exceeding 1000
>>>>>> days.
>>>>>> I have a plan to do a bit of data warehousing, so that the total
>>>>>> number of impressions for a listing is summarized into one row in a
>>>>>> separate table. Once it is set up, a CRON will be able to run each
>>>>>> night and summarize the impressions into this new table. Queries
>>>>>> should be able to read data quickly off this table.
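>>>>>> Roughly what I have in mind (table and column names are provisional):
>>>>>> CREATE TABLE listing_impression_daily (
>>>>>>   listing_id BIGINT NOT NULL,
>>>>>>   `day` DATE NOT NULL,
>>>>>>   impressions INT NOT NULL,
>>>>>>   PRIMARY KEY (listing_id, `day`)
>>>>>> );
>>>>>> -- the nightly cron then summarizes the day just finished:
>>>>>> INSERT INTO listing_impression_daily (listing_id, `day`, impressions)
>>>>>> SELECT listing_id, DATE_SUB(CURDATE(), INTERVAL 1 DAY), COUNT(*)
>>>>>> FROM listing_impression
>>>>>> WHERE `timestamp` >= DATE_SUB(CURDATE(), INTERVAL 1 DAY)
>>>>>>   AND `timestamp` < CURDATE()
>>>>>> GROUP BY listing_id;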
>>>>>> The problem I am having is that I need to summarize all the past
>>>>>> data, which spans over 1000 days, and I am finding that the MySQL
>>>>>> query to collect one day's worth of summaries can easily take 5
>>>>>> minutes. I am thinking that if I run a CRON to summarize a day's
>>>>>> data into the warehousing table every 15 minutes, then it is going
>>>>>> to take something like 11 days.
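>>>>>> (Or could the whole history be backfilled in a single pass instead
>>>>>> of day by day? Something like:)
>>>>>> INSERT INTO listing_impression_daily (listing_id, `day`, impressions)
>>>>>> SELECT listing_id, DATE(`timestamp`), COUNT(*)
>>>>>> FROM listing_impression
>>>>>> GROUP BY listing_id, DATE(`timestamp`);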
>>>>>> At present I am also unsure of the settings on the Apache server; I
>>>>>> don't think I will be able to have scripts running for more than 10
>>>>>> minutes before Apache stops them, and at busy times of day my script
>>>>>> to summarize a day's data could easily end up exceeding this.
>>>>>> Anyone got experience and good advice on dealing with these issues?
>>>>>> Cheers
>>>>>> Greg

