Re: Why does a group_concat on a join change aggregate values?

2015-05-07 Thread Paul Halliday
Should have showed the whole thing. Take a look here (click image to see
full output):

http://www.pintumbler.org/tmp

On Thu, May 7, 2015 at 4:11 PM, shawn l.green 
wrote:

> Hi Paul,
>
> On 5/7/2015 10:17 AM, Paul Halliday wrote:
>
>> Fighting a bit with this one...
>>
>> If I do something like (pseudo):
>>
>> SELECT count(val) AS n, GROUP_CONCAT(types) AS c_types FROM tbl1
>>
>> returns something like:
>>
>> n  c_types
>> 1  t9
>>
>> when I add a left join though:
>>
>> SELECT count(val) AS n, GROUP_CONCAT(types) AS c_types,
>> GROUP_CONCAT(two.types) AS d_types FROM tbl1 LEFT JOIN tbl2 AS two ON
>> tbl1.id = tbl2.id
>>
>> returns something like:
>>
>> val  c_types  d_types
>> 3t9,t9,t9 a2,a3,a9
>>
>> I can have as many group_concats against the same table with varying
>> results and they don't affect COUNT() but once I do that JOIN things start
>> to fall apart.
>>
>> What is happening behind the scenes?
>>
>> Thanks!
>>
>>  Here's a simple test. Change the query to no longer have the aggregate
> functions, then start counting rows by hand.
>
> SELECT
>   val AS n
> , types AS c_types
> , two.types AS d_types
> FROM tbl1
> LEFT JOIN tbl2 AS two
>   ON tbl1.id = tbl2.id
>
> The other thing that springs to mind is that you lack a GROUP BY in your
> query. It isn't required but they can often help get you to the correct
> answer.
>
> Best regards,
>
> --
> Shawn Green
> MySQL Senior Principal Technical Support Engineer
> Oracle USA, Inc. - Hardware and Software, Engineered to Work Together.
> Office: Blountville, TN
>
>
> --
> MySQL General Mailing List
> For list archives: http://lists.mysql.com/mysql
> To unsubscribe:http://lists.mysql.com/mysql
>
>


-- 
Paul Halliday
http://www.pintumbler.org/


Why does a group_concat on a join change aggregate values?

2015-05-07 Thread Paul Halliday
Fighting a bit with this one...

If I do something like (pseudo):

SELECT count(val) AS n, GROUP_CONCAT(types) AS c_types FROM tbl1

returns something like:

n  c_types
1  t9

when I add a left join though:

SELECT count(val) AS n, GROUP_CONCAT(types) AS c_types,
GROUP_CONCAT(two.types) AS d_types FROM tbl1 LEFT JOIN tbl2 AS two ON
tbl1.id = tbl2.id

returns something like:

val  c_types  d_types
3t9,t9,t9 a2,a3,a9

I can have as many group_concats against the same table with varying
results and they don't affect COUNT() but once I do that JOIN things start
to fall apart.

What is happening behind the scenes?

Thanks!

-- 
Paul Halliday
http://www.pintumbler.org/


Re: Help with REGEXP

2015-03-19 Thread Paul Halliday
I don't think it accepts \d, or much of anything else I am used to
putting in expressions :)

This is what I ended up with and it appears to be working:

REGEXP '10.[[:alnum:]]{1,3}.(22[4-9]|23[0-9]).[[:alnum:]]{1,3}'



On Thu, Mar 19, 2015 at 11:10 AM, Michael Dykman  wrote:
> Trying to pattern match ip addresses is a famous anti-pattern; it's one of
> those things like you feel like it should work, but it won't.
>
> Your case, however, is pretty specific. taking advantage of the limited
> range (I will assume you only wanted 4 sections of IPv4)
>
> this should come close:
>
> 10[.]\d{1,3}[.](224|225|226|227|228|229|23\d))[.]\d{1.3}
>
> On Thu, Mar 19, 2015 at 9:39 AM, Paul Halliday 
> wrote:
>
>> I am trying to pick out a range of IP addresses using REGEXP but
>> failing miserably :)
>>
>> The pattern I want to match is:
>>
>> 10.%.224-239.%.%
>>
>> The regex I have looks like this:
>>
>> AND INET_NTOA(src_ip) REGEXP
>> '\d{1,3}\\.\d{1,3}\.(22[4-9]|23[0-9])\\.\d{1,3}'
>>
>> but, go fish. Thoughts?
>>
>>
>> Thanks!
>>
>> --
>> Paul Halliday
>> http://www.pintumbler.org/
>>
>> --
>> MySQL General Mailing List
>> For list archives: http://lists.mysql.com/mysql
>> To unsubscribe:http://lists.mysql.com/mysql
>>
>>
>
>
> --
>  - michael dykman
>  - mdyk...@gmail.com
>
>  May the Source be with you.



-- 
Paul Halliday
http://www.pintumbler.org/

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql



Help with REGEXP

2015-03-19 Thread Paul Halliday
I am trying to pick out a range of IP addresses using REGEXP but
failing miserably :)

The pattern I want to match is:

10.%.224-239.%.%

The regex I have looks like this:

AND INET_NTOA(src_ip) REGEXP '\d{1,3}\\.\d{1,3}\.(22[4-9]|23[0-9])\\.\d{1,3}'

but, go fish. Thoughts?


Thanks!

-- 
Paul Halliday
http://www.pintumbler.org/

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql



InnoDB error 5

2013-11-21 Thread Paul Halliday
Had a system crash this morning and I can't seem to get mysql back up
and running. This is the error:

InnoDB: Progress in percent: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40
41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63
64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86
87 88 89 90 91 92 93 94 95 96 97 98 99 2013-11-21 08:47:26 1570
[ERROR] InnoDB: Tried to read 16384 bytes at offset 589824. Was only
able to read -1.
2013-11-21 08:47:26 802808c00  InnoDB: Operating system error number 5
in a file operation.
InnoDB: Error number 5 means 'Input/output error'.
InnoDB: Some operating system error numbers are described at
InnoDB: http://dev.mysql.com/doc/refman/5.6/en/operating-system-error-codes.html
2013-11-21 08:47:26 802808c00  InnoDB: File operation call: 'read'
returned OS error 105.
2013-11-21 08:47:26 802808c00  InnoDB: Cannot continue operation.

I followed that link but it doesn't tell me anything outside of what
is above. Can I fix this?

Thanks.

-- 
Paul Halliday
http://www.pintumbler.org/

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql



Re: InnoDB error 5

2013-11-21 Thread Paul Halliday
It was indeed corruption :/ what a day. I was able to move everything
over to another partition and have managed to get mysql up and running
again.  There was a single file I could not, an .idb (the ,.frm is
there). Is it possible to fix this from ibdata or the logs?

Thanks.

On Thu, Nov 21, 2013 at 9:46 AM, Manuel Arostegui  wrote:
> 2013/11/21 Reindl Harald 
>
>>
>> Am 21.11.2013 13:51, schrieb Paul Halliday:
>> > Had a system crash this morning and I can't seem to get mysql back up
>> > and running. This is the error:
>> >
>> > InnoDB: Progress in percent: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
>> > 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40
>> > 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63
>> > 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86
>> > 87 88 89 90 91 92 93 94 95 96 97 98 99 2013-11-21 08:47:26 1570
>> > [ERROR] InnoDB: Tried to read 16384 bytes at offset 589824. Was only
>> > able to read -1.
>> > 2013-11-21 08:47:26 802808c00  InnoDB: Operating system error number 5
>> > in a file operation.
>> > InnoDB: Error number 5 means 'Input/output error'.
>> > InnoDB: Some operating system error numbers are described at
>> > InnoDB:
>> http://dev.mysql.com/doc/refman/5.6/en/operating-system-error-codes.html
>> > 2013-11-21 08:47:26 802808c00  InnoDB: File operation call: 'read'
>> > returned OS error 105.
>> > 2013-11-21 08:47:26 802808c00  InnoDB: Cannot continue operation.
>> >
>> > I followed that link but it doesn't tell me anything outside of what
>> > is above. Can I fix this?
>>
>> i would look in the *system logs* because this pretty sure comes
>> from the underlying operating system and is *not* mysql specific
>> which is also in the message statet with "returned OS error 105"
>>
>>
>> http://lxr.free-electrons.com/source/include/uapi/asm-generic/errno-base.h#L8
>>
>>
> Looks like a broken disk or FS corruption :-(
>
> Good luck!
> Manuel.



-- 
Paul Halliday
http://www.pintumbler.org/

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql



RE: How do I select all rows of table that have some rows in another table (AND, not OR)

2013-06-12 Thread Paul Halliday
I am so, so glad that someone finally said what I think each time I see a 
message from you Mr. James. 

 Original message 
From: Rick James  
Date: 06-12-2013  8:45 PM  (GMT-04:00) 
To: Daevid Vincent ,mysql@lists.mysql.com 
Subject: RE: How do I select all rows of table that have some rows in
  another table (AND, not OR) 
 
"I'm the ORIGINAL Rick James, B"  (And, I'm still alive.)  LOL

If you are using PHP, you might want to stop at 31 bits per INT/SET.  PHP seems 
not to yet be in the 64-bit world.

> -Original Message-
> From: Daevid Vincent [mailto:dae...@daevid.com]
> Sent: Wednesday, June 12, 2013 11:26 AM
> To: mysql@lists.mysql.com
> Subject: RE: How do I select all rows of table that have some rows in
> another table (AND, not OR)
> 
> This idea is so fancy pants and clever I *wish* it could have worked for
> me.
> I checked and we actually have 65 genres currently (with more to come I'm
> sure) LOL *FML*. I'm almost ashamed I didn't think of this. I <3 me some
> bitmasks and this solution is so elegant. It's unfortunate there isn't a
> way to use more than 64-bits natively.
> 
> You're RICK JAMES Bitch! :-p   (please tell me you know the Dave
> Chappelles
> skit I'm referring to)
> 
> > -Original Message-
> > From: Rick James [mailto:rja...@yahoo-inc.com]
> > Sent: Wednesday, June 12, 2013 9:39 AM
> > To: Daevid Vincent; mysql@lists.mysql.com
> > Cc: 'shawn green'
> > Subject: RE: How do I select all rows of table that have some rows in
> > another table (AND, not OR)
> >
> > Thinking out of the box... (And posting my reply at the 'wrong' end of
> > the email.)...
> >
> > Are there fewer than 64 genres?  Use a SET or BIGINT UNSIGNED.
> >
> > > > >  AND sg.`genre_id` IN (10,38)
> > > > >  AND sg.`genre_id` NOT IN (22,61)
> >
> > -->
> > AND  genre & ((1<<10) | (1<<38)) != 0
> > AND  genre & ((1<<22) | (1<<61))  = 0
> >
> > This would avoid having that extra table, and save a lot of space.
> >
> > If you have more than 64 genres, then
> > Plan A: clump them into some kind of application grouping and use
> > multiple INTs/SETs.
> > Plan B: do mod & div arithmetic to compute which genre field to tackle.
> >
> > For B, something like:
> > AND  (genre1 & (1<<0)) + (genre3 & (1<<8)) != 0
> > AND  (genre2 & (1<<2)) + (genre6 & (1<<1))  = 0 (That's assuming
> > 10 bits per genre# field.  I would use 32 and INT
> > UNSIGNED.)
> >
> >
> >
> > > -Original Message-
> > > From: Daevid Vincent [mailto:dae...@daevid.com]
> > > Sent: Tuesday, June 11, 2013 4:17 PM
> > > To: mysql@lists.mysql.com
> > > Cc: 'shawn green'
> > > Subject: RE: How do I select all rows of table that have some rows
> > > in another table (AND, not OR)
> > >
> > >
> > >
> > > > -Original Message-
> > > > From: shawn green [mailto:shawn.l.gr...@oracle.com]
> > > > Sent: Tuesday, June 11, 2013 2:16 PM
> > > > To: mysql@lists.mysql.com
> > > > Subject: Re: How do I select all rows of table that have some rows
> > > > in another table (AND, not OR)
> > > >
> > > > Hello Daevid,
> > > >
> > > > On 6/11/2013 3:59 PM, Daevid Vincent wrote:
> > > > > I am trying to implement a filter so that a user could select
> > > > > various
> > > > genres
> > > > > they want "in" or "out". Perhaps they like 'action' and 'car
> chases'
> > > > > but don't like 'foreign' and 'drifting' (or whatever...)
> > > > >
> > > > > So I want something sort of like this, however IN() is using an
> "OR"
> > > > > comparison when I need it to be an "AND"
> > > > >
> > > > > SELECT DISTINCT
> > > > >  s.`scene_id` AS `id`,
> > > > >  GROUP_CONCAT(sg.`genre_id`) FROM
> > > > >  `dvds` AS d
> > > > >  JOIN `scenes_list` AS s
> > > > >  ON s.`dvd_id` = d.`dvd_id`
> > > > >  JOIN `scenes_genres` AS sg
> > > > >  ON sg.`scene_id` = s.`scene_id`
> > > > >  AND sg.`genre_id` IN (10,38)
> > > > >  AND sg.`genre_id` NOT IN (22,61) GROUP BY
> > > > > s.`scene_id`;
> > > > >
> > > > > This is giving me way way too many rows returned.
> > > > >
> > > > > For example, I would expect this scene_id to be in the result set:
> > > > >
> > > > > SELECT * FROM scenes_genres WHERE scene_id = 17;
> > > > >
> > > > > scene_id  genre_id
> > > > >   --
> > > > >    17   1
> > > > >    17   3
> > > > >    17  10 <--
> > > > >    17  19
> > > > >    17  38 <--
> > > > >    17  53
> > > > >    17  58
> > > > >    17  59
> > > > >
> > > > > And this scene ID to NOT be in the result set:
> > > > >
> > > > > SELECT * FROM scenes_genres WHERE scene_id = 11;
> > > > >
> > > > > scene_id  genre_id
> > > > >   --
> > > > >    11   1
> > > > >    11  10 <--
> > > > >    11  19
> > > > >    11  31
> > > > >    11  32
> > 

Re: query help

2012-07-31 Thread Paul Halliday
On Tue, Jul 31, 2012 at 7:02 PM, Haluk Karamete  wrote:
> I've got a text field called source_recid. It stores half string half
> number like strings in it.
>
> Example
>
> shop.orders.32442
>
> the syntax is DATABASENAME.TABLENAME.RECID
>
> My goal is to scan this col and find out the biggest RECID ( the
> integer) in it.
>
> So, in a case like this
>
> shop.orders.32442
> shop.orders.82000
> shop.orders.34442
>
> It would be the record whose source_recid  is shop.orders.82000. Why? Cause
> 82000 happens to be the largest integer.
>
> What SQL statement would get me that record?
>
> One option to this is to create a new column ( the_ids ) and move all the
> integers in it and then run something like this
>
> select source_recid from mytable where source_recid like 'shop.orders.%'
> order by the_ids DESC LIMIT 1
>
> Is there a way to pull this off without going thru this step?

Would substring work?

SELECT SUBSTRING(recid, 13, ) AS numbers FROM table ORDER
BY numbers DESC


-- 
Paul Halliday
http://www.pintumbler.org/

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql



Re: Group expansion as part of the result

2012-03-27 Thread Paul Halliday
On Tue, Mar 27, 2012 at 3:43 PM, Dan Nelson  wrote:
> In the last episode (Mar 27), Paul Halliday said:
>> Say I have:
>>
>> SELECT COUNT(name), name, COUNT(DISTINCT(status)) FROM table GROUP BY name
>>
>> and it returns:
>>
>> 20 paul 5
>> 19 john 2
>> 75 mark 3
>>
>> is there a way to return what comprises  DISTINCT(status) as part of the 
>> result?
>>
>> so:
>>
>> 20 paul 2,3,1,20,9
>> 19 john 20,9
>> 75 mark 1,20,9
>
> You want GROUP_CONCAT:
>

Heh, and exactly how I wanted it formatted.

Great! Thanks.

-- 
Paul Halliday
http://www.squertproject.org/

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql



Delete from another table on update.

2012-02-06 Thread Paul Halliday
Is it possible to wrap a DELETE statement in an ON DUPLICATE KEY UPDATE?

Something like: ON DUPLICATE KEY UPDATE host="b1" (DELETE FROM
another_table WHERE host="b1") ?

Thanks.

-- 
Paul Halliday
http://www.squertproject.org/

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql



Re: Inconsistent query result.

2011-10-11 Thread Paul Halliday
On Tue, Oct 11, 2011 at 10:37 AM, Peter Brawley
 wrote:
> On 10/11/2011 8:11 AM, Paul Halliday wrote:
>>
>> I have the following query:
>>
>> SELECT COUNT(signature) AS count, MAX(timestamp) AS maxTime,
>> INET_NTOA(src_ip), map1.cc as src_cc, INET_NTOA(dst_ip), map2.cc as
>> dst_cc, signature, signature_id, ip_proto FROM event LEFT JOIN
>> mappings AS map1 ON event.src_ip = map1.ip LEFT JOIN mappings AS map2
>> ON event.dst_ip = map2.ip WHERE timestamp BETWEEN '2011-10-11
>> 03:00:00' AND '2011-10-12 02:59:59' AND (map1.cc != 'US' OR map2.cc !=
>> 'US') AND (map1.c_long != 'US' OR map2.c_long != 'US') GROUP BY
>> src_ip, src_cc, dst_ip, dst_cc, signature, signature_id, ip_proto
>> ORDER BY maxTime DESC LIMIT 5000
>>
>> The part that is causing the strange result is probably this:
>>
>> AND (map1.cc != 'US' OR map2.cc != 'US') AND (map1.c_long != 'US' OR
>> map2.c_long != 'US')
>>
>> the value could be 'US' or 'UNITED STATES' depending on user input so
>> I check both table fields against their input.
>>
>> On my system (fbsd 5.1.48) 'US' is not shown, on another system (linux
>> 5.1.54), 'US' appears in the results.
>>
>> Is there a better way to write this?
>
> Is this what you mean?
>
> SELECT
>  COUNT(signature) AS count, MAX(timestamp) AS maxTime, INET_NTOA(src_ip),
> map1.
>  cc as src_cc, INET_NTOA(dst_ip), map2.cc as dst_cc,
>  signature, signature_id, ip_proto
> FROM event
> LEFT JOIN mappings AS map1 ON event.src_ip = map1.ip AND map1.cc != 'US' AND
> map1.c_long != 'United States'
> LEFT JOIN mappings AS map2 ON event.dst_ip = map2.ip AND map2.cc != 'US' AND
> map2.c_long != 'United States'
> WHERE timestamp BETWEEN '2011-10-11 03:00:00' AND '2011-10-12 02:59:59'
> GROUP BY src_ip, src_cc, dst_ip, dst_cc, signature, signature_id, ip_proto
> ORDER BY maxTime DESC LIMIT 5000
>
> PB
>

Is it OK to keep adding to those joins? In a view there could be say
50 countries. The user can keep on adding more to exclude. So would I
just expand on the ANDs like so:

AND map1.cc !='CA' AND map1.cc !='US' AND map1.cc !='LV'... ?

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org



Inconsistent query result.

2011-10-11 Thread Paul Halliday
I have the following query:

SELECT COUNT(signature) AS count, MAX(timestamp) AS maxTime,
INET_NTOA(src_ip), map1.cc as src_cc, INET_NTOA(dst_ip), map2.cc as
dst_cc, signature, signature_id, ip_proto FROM event LEFT JOIN
mappings AS map1 ON event.src_ip = map1.ip LEFT JOIN mappings AS map2
ON event.dst_ip = map2.ip WHERE timestamp BETWEEN '2011-10-11
03:00:00' AND '2011-10-12 02:59:59' AND (map1.cc != 'US' OR map2.cc !=
'US') AND (map1.c_long != 'US' OR map2.c_long != 'US') GROUP BY
src_ip, src_cc, dst_ip, dst_cc, signature, signature_id, ip_proto
ORDER BY maxTime DESC LIMIT 5000

The part that is causing the strange result is probably this:

AND (map1.cc != 'US' OR map2.cc != 'US') AND (map1.c_long != 'US' OR
map2.c_long != 'US')

the value could be 'US' or 'UNITED STATES' depending on user input so
I check both table fields against their input.

On my system (fbsd 5.1.48) 'US' is not shown, on another system (linux
5.1.54), 'US' appears in the results.

Is there a better way to write this?

Thanks.

-- 
Paul Halliday
http://www.squertproject.org/

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org



Substring confusion.

2011-07-15 Thread Paul Halliday
Does anyone know why this happens:

mysql> SELECT COUNT(*) AS count, SUBSTRING(timestamp,12,5) AS time
FROM event WHERE timestamp BETWEEN '2011-07-15 03:00:00' AND
'2011-07-16 02:59:59' GROUP BY time ORDER BY time limit 4;
+---+---+
| count | time  |
+---+---+
| 5 | 03:00 |
| 2 | 03:01 |
| 2 | 03:02 |
| 5 | 03:03 |
+---+---+

mysql> SELECT COUNT(*) AS count, SUBSTRING(timestamp,12,5) AS time
FROM event WHERE timestamp BETWEEN '2011-07-14 03:00:00' AND
'2011-07-15 02:59:59' GROUP BY time ORDER BY time limit 4;
+---+---+
| count | time  |
+---+---+
| 8 | 00:00 |
| 4 | 00:01 |
| 3 | 00:02 |
| 1 | 00:03 |
+---+---+

First one returns what I expect, when I do it on a day in the past, it
always starts at 00:00 instead of 03:00.

If I change it a bit though, I get what I expect:

mysql> SELECT COUNT(*) AS count, SUBSTRING(timestamp,10,7) AS time
FROM event WHERE timestamp BETWEEN '2011-07-14 03:00:00' AND
'2011-07-15 02:59:59' GROUP BY time ORDER BY time limit 4;
+---+-+
| count | time|
+---+-+
| 4 | 4 03:00 |
| 6 | 4 03:01 |
| 1 | 4 03:02 |
|     2 | 4 03:03 |
+---+-+

Why is this?

Thanks.

-- 
Paul Halliday
http://www.squertproject.org/

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org



Design question.

2011-02-17 Thread Paul Halliday
I maintain a little open source project that deals with IDS alert
data. I want to add IP reputation to my event queries and I am stuck
on how I should implement it.

The user will have the option of bringing in lists from different
providers and the limit will not be fixed. These lists will be a
single column of IP addresses.

list 1: IP listing
list 2: IP listing
list 3: IP listing
...

There can, and most likely will be duplication of addresses across the
different lists. The number of lists that a host is a member of will
be an indication of its reputation.

The desired result will be something like:

event count | event signature | src ip | country | ip reputation | dst
ip | country | ip reputation

The lists will be updated once each day or on demand.

I already have a mappings table that provides country information for
ip's in the event table which is joined during the event queries. The
mappings table contains a little under 500,000 addresses and grows
slowly - say 50 to 100 addresses / day. As new ip's appear in the
event table, they are mapped to a country.

Questions:

1) Should I just create a new table for every list the user adds and
then do joins on these?
2) Should I put the lists in 1 table somehow?
3) As the lists are done daily, should I just run a midnight task that
parses each list and adds the information to the mappings table. I
have no idea what the format would look like. I was thinking of
creating a varchar and have something like: list1|list2|list9|list20
and then just breaking it out in the code. The entire table would of
course need to be scanned each day to check whether or not an address
had been taken off a list. (efficiency?)

Any comments/suggestions would be greatly appreciated.

Thanks.
-- 
Paul Halliday
http://www.pintumbler.org

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org



Help with query.

2011-02-01 Thread Paul Halliday
I have a query (thanks to this list) that uses a join to add country
information to an IP. It looks like this:

SELECT COUNT(signature) AS count, INET_NTOA(src_ip), map1.cc as
src_cc, INET_NTOA(dst_ip), map2.cc as dst_cc
FROM event LEFT JOIN mappings AS map1 ON event.src_ip = map1.ip LEFT
JOIN mappings AS map2 ON event.dst_ip = map2.ip
WHERE timestamp BETWEEN '2011-01-29 00:00:00' AND '2011-01-30 00:00:00'
GROUP BY src_ip, src_cc, dst_ip, dst_cc
ORDER BY src_cc, dst_cc ASC;

This would return something like this:

 +---+---++---++
| count | INET_NTOA(src_ip) | src_cc | INET_NTOA(dst_ip) | dst_cc |
+---+---++---++
| 8 | 10.0.0.8   | NULL   | 61.55.142.129 | CN |
| 1 | 210.52.216.92 | CN | 10.0.0.2 | NULL   |
| 1 | 121.33.205.235| CN | 172.16.0.6| NULL   |
|   239 | 210.52.216.92 | CN | 10.0.0.2| NULL   |
| 2 | 121.33.205.235| CN | 172.16.0.15   | NULL   |
| 4 | 121.33.205.235| CN | 10.0.0.1| NULL   |
|39 | 210.52.216.92 | CN | 172.16.0.15   | NULL   |
| 1 | 121.33.205.235| CN | 172.16.0.14   | NULL   |
+---+---++---++

All I am interested in is the event count for each country, in this case:

295 CN
... Other countries..

I can do this in code, more work of course, but I am just curious if I
can pull it off with a single query.

Thanks!
-- 
Paul Halliday
http://www.pintumbler.org

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org



Re: Lowest non-zero number

2010-12-03 Thread Paul Halliday
On Fri, Dec 3, 2010 at 12:50 PM, Mark Goodge  wrote:
> Given a table containing a range of INT values, is there any easy way to
> select from it the lowest non-zero number?
>

SELECT number FROM table WHERE number > 0 ORDER BY number ASC LIMIT 1;

?
-- 
Paul Halliday
Ideation | Individualization | Learner | Achiever | Analytical
http://www.pintumbler.org

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org



Re: Missing results.

2010-11-04 Thread Paul Halliday
> Presumably those records were absorbed into your 'group by' clause, since 
> there was an entry, from a later time, which had the same values for all the 
> group by fields.
>
>
> --
> Simcha Younger 
>

Geez, how obvious. I was thinking on a completely different plane. I
feel pretty silly now. :)

Thanks.

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org



Missing results.

2010-11-04 Thread Paul Halliday
I can't wrap my head around this one..

I have this query:

SELECT COUNT(signature) AS count, MAX(timestamp) AS maxTime,
INET_NTOA(src_ip), src.cc, INET_NTOA(dst_ip), dst.cc FROM event LEFT
JOIN mappings AS src ON event.src_ip = src.ip LEFT JOIN mappings AS
dst ON event.dst_ip = dst.ip WHERE timestamp BETWEEN '2010-11-03
13:00:00' AND '2010-11-03 14:00:00' AND (signature LIKE '%ET POLICY
facebook apps activity%') AND (INET_NTOA(src_ip) LIKE '10.13.11.29' OR
INET_NTOA(dst_ip) LIKE '10.13.11.29') GROUP BY src_ip, src.cc, dst_ip,
dst.cc ORDER BY maxTime DESC;

which returns something like:

12010-11-03 13:41:50
380  2010-11-03 13:41:46 <-
24   2010-11-03 13:22:39 <-
12010-11-03 13:19:20 <-
12010-11-03 13:19:20
12010-11-03 13:18:37

Now, if I add 6 hours to the end timestamp (the only change I made to
the query) it returns this:

30   2010-11-03 19:58:55
209  2010-11-03 19:40:00
95   2010-11-03 19:28:28
12010-11-03 19:04:30
52010-11-03 19:02:19
82010-11-03 19:02:19
42010-11-03 19:02:18
22010-11-03 18:44:02
765  2010-11-03 18:21:44
52010-11-03 18:09:22
22010-11-03 18:09:22
12010-11-03 17:50:19
12010-11-03 17:33:39
12010-11-03 15:54:57
52010-11-03 15:02:14
12010-11-03 15:02:00
22010-11-03 14:50:19
12010-11-03 13:41:50
12010-11-03 13:19:20
12010-11-03 13:18:37


Where did 13:41:46, 13:19:20 and 13:22:39  go?

Thanks!
-- 
Paul Halliday
Ideation | Individualization | Learner | Achiever | Analytical
http://www.pintumbler.org

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org



Re: Can this query be done w/o adding another column?

2010-10-13 Thread Paul Halliday
After bashing at this for a while with no luck I replaced the "inner" with
"left" and I got the desired result.

Thanks for the help.

On Tue, Oct 12, 2010 at 5:18 PM, Travis Ard  wrote:

> Sorry, try changing the column mappings.ip to use the table aliases (m.ip
> and m2.ip).
>
>
>
> -Travis
>
>
>
> From: Paul Halliday [mailto:paul.halli...@gmail.com]
> Sent: Tuesday, October 12, 2010 11:37 AM
> To: Travis Ard
> Cc: mysql@lists.mysql.com
> Subject: Re: Can this query be done w/o adding another column?
>
>
>
> On Tue, Oct 12, 2010 at 1:59 PM, Travis Ard 
> wrote:
>
> You could join your mappings table twice, once on src_ip and again on
> dst_ip:
>
> SELECT COUNT(signature) AS count,
>
>  MAX(timestamp) AS maxTime,
>  INET_NTOA(src_ip),
>
>  m.cc as src_cc,
>  INET_NTOA(dst_ip),
>  m2.cc as dst_cc,
>
>  signature,
>  signature_id,
>  ip_proto
> FROM event
>
> INNER JOIN mappings m ON event.src_ip = mappings.ip
> INNER JOIN mappings m2 ON event.dst_ip = mappings.ip
>
> WHERE timestamp BETWEEN "2010-10-12 03:00:00" AND "2010-10-13 03:00:00"
>
> GROUP BY INET_NTOA(src_ip), m.cc, INET_NTOA(dst_ip), m2.cc, signature,
> signature_id, ip_proto
>
> ORDER BY maxTime DESC
> LIMIT 10;
>
> -Travis
>
>
>
>
> I get an error: ERROR 1054 (42S22): Unknown column 'mappings.ip' in 'on
> clause'
>
>


-- 
Paul Halliday
Ideation | Individualization | Learner | Achiever | Analytical
http://www.pintumbler.org


Re: Can this query be done w/o adding another column?

2010-10-12 Thread Paul Halliday
On Tue, Oct 12, 2010 at 1:59 PM, Travis Ard  wrote:

> You could join your mappings table twice, once on src_ip and again on
> dst_ip:
>
> SELECT COUNT(signature) AS count,
>MAX(timestamp) AS maxTime,
>   INET_NTOA(src_ip),
>m.cc as src_cc,
>   INET_NTOA(dst_ip),
>   m2.cc as dst_cc,
>signature,
>   signature_id,
>   ip_proto
> FROM event
> INNER JOIN mappings m ON event.src_ip = mappings.ip
> INNER JOIN mappings m2 ON event.dst_ip = mappings.ip
> WHERE timestamp BETWEEN "2010-10-12 03:00:00" AND "2010-10-13 03:00:00"
> GROUP BY INET_NTOA(src_ip), m.cc, INET_NTOA(dst_ip), m2.cc, signature,
> signature_id, ip_proto
> ORDER BY maxTime DESC
> LIMIT 10;
>
> -Travis
>
>
I get an error: ERROR 1054 (42S22): Unknown column 'mappings.ip' in 'on
clause'


Can this query be done w/o adding another column?

2010-10-12 Thread Paul Halliday
Geez, really taking advantage of the list today :). This one is a little
more complicated, well, in my head anyway.

Same tables as before, event and mappings. Mappings is just IP to Country
info. I want to be able to join both a src and dst but the problem is the
mappings table just has one ip column.

My initial query looks like this:

SELECT COUNT(signature) as count, MAX(timestamp) AS maxTime,
INET_NTOA(src_ip), INET_NTOA(dst_ip), signature, signature_id, ip_proto FROM
event WHERE timestamp BETWEEN "2010-10-12 00:00:00" AND "2010-10-13
00:00:00" GROUP BY src_ip, dst_ip, signature,ip_proto ORDER BY maxTime DESC;

Which would return something like this:

2 | 2010-10-12 16:34:17 | 10.1.2.3 | 173.193.202.69 | ET P2P Vuze BT UDP
Connection | 2010144 | 17

Now I want to add the country info into the mix. I have made it this far:

SELECT COUNT(signature) as count, MAX(timestamp) AS maxTime,
INET_NTOA(src_ip), mappings.cc, INET_NTOA(dst_ip), mappings.cc, signature,
signature_id, ip_proto FROM event INNER JOIN mappings ON event.src_ip =
mappings.ip OR event.dst_ip = mappings.ip WHERE timestamp BETWEEN
"2010-10-12 03:00:00" AND "2010-10-13 03:00:00" GROUP BY src_ip, dst_ip,
signature,ip_proto ORDER BY maxTime DESC LIMIT 10;

gives me:
2 | 2010-10-12 16:34:17 | 10.1.2.3 | US | 173.193.202.69 | US | ET P2P Vuze
BT UDP Connection | 2010144 | 17

which obviously isn't right ;) but is close. I was just going to change the
columns in the mappings table to have src_ip and dst_ip just duplicating the
ip column but I have a nagging feeling that that probably isn't necessary.

Thanks.

-- 
Paul Halliday
Ideation | Individualization | Learner | Achiever | Analytical
http://www.pintumbler.org


Re: Can this be done with a single query?

2010-10-12 Thread Paul Halliday
On Tue, Oct 12, 2010 at 11:14 AM, Johnny Withers wrote:

> I would try:
>
> SELECT DISTINCT(e.src_ip)
> FROM event AS e
> LEFT JOIN mappings AS m ON e.src_ip=m.src_ip
> WHERE e.timestamp BETWEEN '2010-10-11 00:00:00' AND '2010-10-12 00:00:00'
>  AND INET_NTOA(e.src_ip) NOT LIKE '10.%.%.%'
> AND INET_NTOA(e.src_ip) NOT LIKE '172.16.%.%'
> AND INET_NTOA(e.src_ip) NOT LIKE '192.168.%.%'
> AND m.src_ip IS NULL
> ;
>
> I would also modify the where clause to use:
>
> AND src_ip NOT BETWEEN INET_ATON('10.0.0.0') AND INET_ATON(10.255.255.255)
> AND src_ip NOT BETWEEN INET_ATON('172.16.0.0') AND
> INET_ATON(172.16.255.255)
> AND src_ip NOT BETWEEN INET_ATON('192.168.0.0') AND
> INET_ATON(192.168.255.255)
>
> instead of
>
> AND INET_NTOA(src_ip) NOT LIKE '10.%.%.%'
> AND INET_NTOA(src_ip) NOT LIKE '172.16.%.%'
> AND INET_NTOA(src_ip) NOT LIKE '192.168.%.%
>
> You should also ensure there is an index on src_ip in events and mappings
> tables.
>
> Using the INET_NTOA() function on the src_ip column will prevent index
> usage during the query.
>

This and the suggestion by Nathan both work.

Thanks for the help!


Can this be done with a single query?

2010-10-12 Thread Paul Halliday
I have 2 tables: events and mappings.

what I want to do is something like:

SELECT DISTINCT(src_ip) FROM event WHERE timestamp BETWEEN '2010-10-11
00:00:00' AND '2010-10-12 00:00:00' AND INET_NTOA(src_ip) NOT LIKE
'10.%.%.%' AND INET_NTOA(src_ip) NOT LIKE '172.16.%.%' AND INET_NTOA(src_ip)
NOT LIKE '192.168.%.%';

but, within that somewhere also check to see if src_ip exists in mappings.
If it does, do not return it.

Is this possible?

Thanks.
-- 
Paul Halliday
Ideation | Individualization | Learner | Achiever | Analytical
http://www.pintumbler.org


Is conversion required?

2010-09-29 Thread Paul Halliday
I just converted (reinstalled) a FreeBSD system from i386 to amd64. Of
course; I missed the memo. I have been struggling to get everything back
online. I just finished exporting a few Gigs of RRD's to XML so that I could
use them :|

My question: I was s/rushing/stupid so I just moved /var/mysql to a
partition (i386) and reinstalled. Can I just copy this back or does some
magic need to happen first?

Thanks!
-- 
Paul Halliday
Ideation | Individualization | Learner | Achiever | Analytical
http://www.pintumbler.org


Re: Join on a where clause.

2009-12-10 Thread Paul Halliday
On Wed, Dec 9, 2009 at 2:24 PM, Joerg Bruehe  wrote:
> Hi everybody!
>
>
> Neil Aggarwal wrote:
>> Paul:
>>
>>> SELECT COUNT(event.src_ip) AS count, INET_NTOA(event.src_ip),
>>> mappings.cc FROM event, mappings WHERE event.timestamp BETWEEN
>>> '2009-12-06 20:00' and '2009-12-07 20:00:00' AND event.src_ip BETWEEN
>>> 'mappings.start_ip' AND 'mappings.end_ip' GROUP BY event.src_ip ORDER
>>> BY count DESC LIMIT 20;
>
> I am surprised by the quotes you have around the "start_ip" and "end_ip"
> columns; to me, this makes that look like strings.

That because I don't know what I am doing :). No quotes on integers; got it!

>
> From your posting, I see the result you hope to get but not the one you
> actually get. IMO, just dropping the single quotes around the two column
> names should produce the data you want to get.
>
> Or what is the result you receive?

Removing the quotes does work. The query however took 1h15m to complete. Yuck.

I am guessing this is because even though there is a limit, it is
still doing the lookup on everything past 20. Also, if the first
address has a count of say 2000, it would be doing the lookup 2000
times for a single address.

Is that right?

If it is I guess I will have to post process the results. Which is
fine, I just like to keep as much in the queries as I can.

Thanks.

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org



Join on a where clause.

2009-12-09 Thread Paul Halliday
I have 2 tables:

1) Event Data
2) Mappings

The query should return something like this:

Hits  IP  Country Code
20213.136.52.29 SE

I am trying this:

SELECT COUNT(event.src_ip) AS count, INET_NTOA(event.src_ip),
mappings.cc FROM event, mappings WHERE event.timestamp BETWEEN
'2009-12-06 20:00' and '2009-12-07 20:00:00' AND event.src_ip BETWEEN
'mappings.start_ip' AND 'mappings.end_ip' GROUP BY event.src_ip ORDER
BY count DESC LIMIT 20;

Am I supposed to do a join somewhere? Do joins even apply in a where
clause? or am I totally off the mark.

Singularly, the queries look like this:

SELECT cc FROM mappings WHERE INET_ATON('src_ip') BETWEEN start_ip AND end_ip;

SELECT COUNT(src_ip) AS count, INET_NTOA(src_ip) FROM event WHERE
timestamp BETWEEN '2009-12-06 20:00' and '2009-12-07 20:00:00' GROUP
BY src_ip ORDER BY count DESC LIMIT 20;


Thanks.

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org



Table advice.

2009-08-01 Thread Paul Halliday
I have a database that I am (will) be using to track URL's. The table
structure looks like this:

CREATE TABLE event
(
  eid   INT UNSIGNED NOT NULL AUTO_INCREMENT,
  timestamp INT(10) UNSIGNED NOT NULL DEFAULT 0,
  ipINT(10) UNSIGNED NOT NULL DEFAULT 0,
  fqdn  VARCHAR(255),
  domainVARCHAR(63),
  tld   VARCHAR(63),
  actionVARCHAR(4),
  request   TEXT,
  referrer  TEXT,
  clientVARCHAR(255),
  INDEX eid (eid),
  INDEX timestamp (timestamp),
  INDEX ip (ip),
  INDEX fqdn (fqdn),
  INDEX domain (domain),
  INDEX tld (tld)
);

The is no real logic behind the indexes, the table was hobbled
together looking at examples. Currently I am trying queries on about
300 million records and the results are pretty crappy. for example, a
query like this:

select domain,count(domain) as count from event where domain like
'%facebook%' group by domain order by count desc;

takes about 5 minutes to complete.

Most of the queries will be like that above but probably with
additional filters like date constraints or IP constraints or a
mixture of both. I can also see searches through the requests for
filetypes etc.

Any suggestions or comments would be appreciated.

Thanks.

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/mysql?unsub=arch...@jab.org



open_files_limit problem.

2007-10-11 Thread Paul Halliday
I am trying to change this value and it doesn't seem to work.

Looking at the processes I have:

mysql   21752  0.0  0.1  1652  1092  p3  I 3:50PM   0:00.01
/bin/sh /usr/local/bin/mysqld_safe
--defaults-extra-file=/var/db/mysql/my.cnf --user=mysql
--datadir=/var/db/mysql
--pid-file=/var/db/mysql/nsccmadb2.campus.nscc.ca.pid
mysql   21770  0.0  1.3 58188 26168  p3  S 3:50PM   0:00.21
/usr/local/libexec/mysqld --defaults-extra-file=/var/db/mysql/my.cnf
--basedir=/usr/local --datadir=/var/db/mysql
--pid-file=/var/db/mysql/nsccmadb2.campus.nscc.ca.pid


Within /var/db/mysql/my.cnf I have:

[mysqld_safe]
open_files_limit = 32768

[mysqld]
open_files_limit = 32768

But when I try something like "mysql> show variables like '%files%';":

I get:

open_files_limit 11095

Is there another variable that needs to be adjusted to bump this up?

Thanks.

-- 
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: Strange query.

2007-01-11 Thread Paul Halliday

Could you expand a little on how that works?

I am starting to think that I wont be able to get the results I need
without post processing.

Looking at the results:
...
| 2007-01-09 20:02:15 |
| 2007-01-09 20:02:15 |
| 2007-01-09 20:03:20 |
| 2007-01-09 20:08:33 |
| 2007-01-09 20:08:33 |
| 2007-01-09 20:12:19 |
| 2007-01-09 20:18:59 |
| 2007-01-09 20:23:03 |
| 2007-01-09 20:50:11 |
| 2007-01-09 20:50:11 |
| 2007-01-09 20:50:12 |
| 2007-01-09 20:50:12 |
| 2007-01-09 20:50:12 |
| 2007-01-09 20:50:12 |
| 2007-01-10 01:16:09 |
| 2007-01-10 01:18:31 |
| 2007-01-10 03:00:22 |
| 2007-01-10 03:09:23 |
| 2007-01-10 03:12:23 |
| 2007-01-10 03:15:23 |
| 2007-01-10 03:18:23 |
| 2007-01-10 03:30:24 |

(the results are GMT, I need AST, hence the day change)

All I am interested in is returning 24 values so with these results I
would have something like:
...
20 - 14
21 - 0
22 - 0
23 - 0
0 - 0
1 - 2
2 - 0
3 - 6

Is this possible?



On 10 Jan 2007 19:55:09 -, Felix Geerinckx
<[EMAIL PROTECTED]> wrote:

[EMAIL PROTECTED] ("Paul Halliday") wrote in
news:[EMAIL PROTECTED]:


> I am trying to deal with empty values so that I can graph data over a
> 24 hour period without gaps.

Have a look here:
http://forums.mysql.com/read.php?10,133603,133607#msg-133607

--
felix

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]




--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: Strange query.

2007-01-10 Thread Paul Halliday

That query doesn't return empty values. Just to clarify what I want as
the result:

My initial query was this,

mysql> select count(*),DATE_FORMAT(timestamp,'%H') AS hour FROM event
WHERE timestamp between '2007-01-10 04:00:00' and '2007-01-11
04:00:00' AND sid=1 GROUP BY hour;
+--+--+
| count(*) | hour |
+--+--+
|4 | 04   |
|5 | 06   |
|5 | 07   |
|1 | 08   |
|7 | 09   |
|   12 | 10   |
|   73 | 12   |
|   31 | 13   |
|   50 | 14   |
+--+--+
9 rows in set (0.03 sec)

What I am looking for is  0's for every empty result and up to the end
of the day.

Thanks.

On 1/10/07, Brent Baisley <[EMAIL PROTECTED]> wrote:

You can't join on the result of calculations in the field selection. The result 
is not associated with any table. So the problem
isn't so much with the date_format statement, but that you are joining on 
HOURS.hour the timestamp, not HOURS.hour the DATE_FORMAT.
I would think you would be getting an error when you run your SELECT.
Your group by can use the result of a calculation. So you may actually have two 
problems, since you are grouping on HOURS.hour, the
timestamp, the 'hour' the alias name for the calculation result.
I'm not sure why you don't just pull the hour from the timestamp either.

SELECT COUNT(*), HOUR(timestamp) AS hour FOM HOURS
LEFT JOIN event ON HOURS.hour=HOUR(timestamp)
WHERE timestamp BETWEEN '2007-01-09 04:00:00'
AND '2007-01-10 04:00:00' AND sid=2 GROUP BY hour

- Original Message -
From: "Paul Halliday" <[EMAIL PROTECTED]>
To: 
Sent: Wednesday, January 10, 2007 8:39 AM
Subject: Strange query.


> Hi,
>
> I am trying to deal with empty values so that I can graph data over a
> 24 hour period without gaps.
>
> I created a table called HOURS which simply has 0->23 and I am trying
> to do a join on this to produce the desired results. I think that the
> DATE_FORMAT in the query is screwing things up.
>
> The query looks something like this:
>
> SELECT COUNT(*),DATE_FORMAT(timestamp,'%H') AS hour FROM HOURS LEFT
> JOIN event ON HOURS.hour=hour WHERE timestamp BETWEEN '2007-01-09
> 04:00:00' AND '2007-01-10 04:00:00' AND sid=2 GROUP BY HOURS.hour;
>
> Any help would be appreciated.
>
> Thanks.
>
> --
> MySQL General Mailing List
> For list archives: http://lists.mysql.com/mysql
> To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
>




--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Strange query.

2007-01-10 Thread Paul Halliday

Hi,

I am trying to deal with empty values so that I can graph data over a
24 hour period without gaps.

I created a table called HOURS which simply has 0->23 and I am trying
to do a join on this to produce the desired results. I think that the
DATE_FORMAT in the query is screwing things up.

The query looks something like this:

SELECT COUNT(*),DATE_FORMAT(timestamp,'%H') AS hour FROM HOURS LEFT
JOIN event ON HOURS.hour=hour WHERE timestamp BETWEEN '2007-01-09
04:00:00' AND '2007-01-10 04:00:00' AND sid=2 GROUP BY HOURS.hour;

Any help would be appreciated.

Thanks.

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: If exists query.

2006-04-24 Thread Paul Halliday
On 4/23/06, John Hicks <[EMAIL PROTECTED]> wrote:
> Paul Halliday wrote:

> > I am doing queries that produce a table that looks something like this:
> >
> > Count   | IP Address| First Seen| Last Seen | Days
> > 5000  10.0.0.1   2005-12-102006-04-15  50*
> > 6500  10.0.0.2   2006-04-012006-04-06  3**
> >
> > *So of the 5000 events "count(*) between 2005-12-10 and 2006-04-14"
> > the count was distributed within 50 distinct days.
> >
> > **This time the count is still high and it occured within 3 days
> > between 2006-04-01 and 2006-04-06.
> >
> > I just cant figure out how to come up with the days part :).
>
> Well, I guess that is a *bit* more specific, but not much better :)
>
> So I'm guessing that the source of this data is perhaps a web access log
> and that you are tracking IP addresses of visitors.
> Can we tickle a little more information out of you?

Sorry,

Ok. The data is IDS events. I am not trying to create any new
information I just want to extract information. This information will
be used to relay whether a particular machine has ongoing issues.

For example,

SELECT COUNT(src_ip) AS CNT, INET_NTOA(src_ip), MAX(timestamp) FROM
event WHERE timestamp > '2006-04-24 03:00:00' AND sid="1" AND
signature_id<>"1" GROUP BY src_ip ORDER BY cnt DESC LIMIT 10

This will give me the top 10 source addresses for today based on how
many events they have triggered.

If they make the top ten, I want to see when we first saw that address:

SELECT MIN(timestamp) FROM event WHERE src_ip=INET_ATON('')

I now want to know; out of all of the days between first seen and last
seen which days had events on them. I dont want the count(events) for
eah day, just whether there was an event or not so that I know between
first seen and last seen what the rate of appearance was.

I could do something crufty like this (the row count would be the
answer I am looking for):

SELECT DATE_FORMAT(timestamp,'%Y-%m-%d %a') AS DAY, COUNT(*) FROM
event WHERE timestamp BETWEEN '2006-01-01' AND '2006-04-24' and
src_ip=INET_NTOA('10.0.0.1') OR dst_ip=INET_ATON('10.0.0.1') GROUP BY
DAY;

But that seems like a lot of extra processing.

Thanks and sorry for the confusion.

> Ideally, it would be nice to know what task you are trying to accomplish.
>
> What is the source of your data?
>
> What is the "condition" you are testing for?
>
> And what, very specifically, is it that you would like us to help you with.
>
> --John
>
>

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: If exists query.

2006-04-23 Thread Paul Halliday
I am doing queries that produce a table that looks something like this:

Count   | IP Address| First Seen| Last Seen | Days
5000  10.0.0.1   2005-12-102006-04-15  50*
6500  10.0.0.2   2006-04-012006-04-06  3**

*So of the 5000 events "count(*) between 2005-12-10 and 2006-04-14"
the count was distributed within 50 distinct days.

**This time the count is still high and it occured within 3 days
between 2006-04-01 and 2006-04-06.

I just cant figure out how to come up with the days part :).

On 4/23/06, John Hicks <[EMAIL PROTECTED]> wrote:
> Paul Halliday wrote:
> > I am trying to formulate a query that will increment if a condition is true.
> >
> > For example, if I do a select (just let me know if there was data on
> > this day, if so increment count by 1 and check the next day) where
> > timestamp between jan and feb.
>
> Could you be a bit more specific?
>
> Where do you want to store the count?
>
> What do you mean by "if there was data on this day"? Does the design of
> the table anticipate no more than one record per day? Does each record
> correspond to a single day?
>
> --John
>

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



If exists query.

2006-04-21 Thread Paul Halliday
I am trying to formulate a query that will increment if a condition is true.

For example, if I do a select (just let me know if there was data on
this day, if so increment count by 1 and check the next day) where
timestamp between jan and feb.

Thanks.

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: Can't search words of three letters?

2006-04-12 Thread Paul Halliday
Why not try regex?

http://dev.mysql.com/doc/refman/4.1/en/regexp.html

On 4/12/06, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
> Hi to all,
> I have to build a search of products on web catalog site. It has to search
> a whole words to avoid when somebody search for a pin and get "*pin*k
> shirt".
> I was suggested to use MATCH() AGAINST() and it works perfect - until I
> dscovered that I'm getting alwayes 0 results if search for cap, pen, mug -
> three-letter words. Since, these are one of the most searched words (we
> are in promotin industry) it's a big problem for me.
> I found on http://dev.mysql.com/doc/refman/4.1/en/fulltext-boolean.html
> that the default value of minimum character length is 4?!?!?
>
> Does anybody solved this problem?
> is there any other solution to suggest?
>
> Thanks for any help.
>
> -afan
>
>
> --
> MySQL General Mailing List
> For list archives: http://lists.mysql.com/mysql
> To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
>
>

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Insuring select returns the last record for a given day.

2006-03-27 Thread Paul Halliday
Hi,

I have a table that looks somthing like this:

ID   timestampcampusIDS ePOinbound
outbound   statinfo

2289411143430287MA0 0424526713
284590944  0  NULL
2289401143430002ST 2 0290248558
119939485  0  NULL
2289391143430290AN122001697436588   
20836217840  NULL

I am trying to read the last record for each column for a given day
(when the stats are input they are already a sum, so the last entry
will be the cumulative total for each day). My query looks like this:

select campus,date_format(from_unixtime(timestamp),'%Y-%m-%d') as
day,inbound,outbound,IDS,ePO from stats where
date_sub(curdate(),interval 7 day) <= from_unixtime(timestamp) and
campus='MA' group by day;

campus day   inbound  outbound  IDS ePO
MA2006-03-21618584262358793491974   0
MA2006-03-2285702498739824446572 0
MA2006-03-238745243413847904889 50
MA2006-03-244856668982854718766  1   0
MA2006-03-25798980766 250534732  00
MA2006-03-26424526713 284590944  00
MA2006-03-27144573737 59843102  790

So this query returns the last value for each day for the past 7 days
grouped by day. Tentative testing shows that the values are indeed
always the last entries but is this the right way to pull off this
query? I dont see anything specific in the query itself to insure it
extracts the last record for said day.

Is this correct or should the query have further processing.

Thanks.

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: Merge tables.

2006-03-14 Thread Paul Halliday
On 3/14/06, Martijn Tonies <[EMAIL PROTECTED]> wrote:
> Hello Paul,
>
> I suggest you reply to the mailinglist :-) ...
>
> > The developer insists that for scalability issues, this was the
> > answer. It is likely, for example in my deployment, that these tables
> > would see upwards of 10 million records or more.
>
> Well, if there are problems with scalability, I guess you could
> split it up in a few (not 1600) tables and have them avaialble
> on different physical hard drives...
>

As an example:

There was a table called event.

This table is now broken up like this:

event __.

So for every sensor, and every day, there is now a new table. So if I
have 20 sensors, every day I will have 20 new tables.

With this in mind, does this design make sense?

how will this scale?

Is there anything I can do through configuration (I doubt the
developer will change the design) to speed things up? or a workaround
that I could do on my end to compensate?

Thanks.

> But -> why try to fix something that ain't broken (yet)?
>
> Were you experiencing problems already? If the application
> is fast WITHOUT merge tables, why bother?
>
> Martijn Tonies
> Database Workbench - development tool for MySQL, and more!
> Upscene Productions
> http://www.upscene.com
> My thoughts:
> http://blog.upscene.com/martijn/
> Database development questions? Check the forum!
> http://www.databasedevelopmentforum.com
>
> > >
> > > > One of the databases I use just switched to using merge tables and now
> > > > my queries are painfully slow. One table, initially had about 2.5
> > > > million records and now with the change this information is spread
> > > > across about 1600 tables. A simple query, say select count(*) has gone
> > > > from .04 to about 30 seconds, sometimes even longer.
> > >
> > > Why on earth would you spread this information across 1600 (!!!)
> > > tables? That's 1600 files to maintain instead of 1.
>
>

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Merge tables.

2006-03-14 Thread Paul Halliday
Hi,

One of the databases I use just switched to using merge tables and now
my queries are painfully slow. One table, initially had about 2.5
million records and now with the change this information is spread
across about 1600 tables. A simple query, say select count(*) has gone
from .04 to about 30 seconds, sometimes even longer.

The OS is FreeBSD on a 3.1 GHz machine with a 1GB ram. Mysql is
4.1.15. My my.cnf currently has only one option:

open_files_limit=24576

If anyone could provide me with some other options I might try I would
appreciate it.

Thanks.

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



problem with excluded days in date range query.

2006-02-03 Thread Paul Halliday
I have the following query:

select count(*) as cnt, date(timestamp) as day from table where
date_sub(curdate(),interval 14 day) <= timestamp group by day;

If there are no data for a particular day, it is not included. Is
there a way to include all days even if the result is 0? I saw an
example that included another table with all dates and an inner join
but that seems a little clunky.

Thanks.

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: Conditional insert or update.

2005-07-20 Thread Paul Halliday
Ok, 

The entire table looks like this:

ip | hostname | mac_current | mac_change | port_current | port_change

The IP addresses are harvested via netflow (a different table) and or
arpwatch (an event). This new table is to augment the current
information that is within the netflow tables but specifically
targetting host info.

There is a shell script that runs every 5 minutes, or as triggered by
arpwatch. The former simply selects all the addresses from the netflow
table and pairs them up with it's MAC address which are extracted from
the flat-file arpwatch database. If arpwatch sees a new address it
will also input the ip/mac as it occurs. The MAC addresses are then
used as arguments for snmp lookups on the switches which will return
which port they are currently plugged into.

In the event that a known mac appears on a different port or belongs
to a different address I want to make note of the change. I just
thought it would be easier to compare the mac_current mac_change
port_current port_change via a select query as opposed to having to do
full selects on the entire table then compare the entries before I
plug them in.

For example, the on duplicate key could work if it was available for
update. It would see that that the ip exists (primary key) which would
mean that mac_current already exists so update mac_new.

Or something like that.. :)

mysql> INSERT INTO table (a,b,c) VALUES (1,2,3)
-> ON DUPLICATE KEY UPDATE c=c+1;

On 7/20/05, Gleb Paharenko <[EMAIL PROTECTED]> wrote:
> Hello.
> 
> 
> 
> 
> 
> Please, provide more information on your's application logic.
> 
> To catch events which occurs for the table use triggers. See:
> 
>   http://dev.mysql.com/doc/mysql/en/triggers.html
> 
> 
> 
> Unfortunately they're available only in MySQL 5.0.x, which is still
> 
> beta.
> 
> 
> 
> 
> 
> Paul Halliday <[EMAIL PROTECTED]> wrote:
> 
> > Hi,=20
> 
> >
> 
> > I have a table that looks something like this:
> 
> >
> 
> > ip | mac_current | mac_change
> 
> >
> 
> > now if I have an entry that looks like this
> 
> >
> 
> > 10.0.0.1 | 11:11:11:11:11:11:11 | NULL
> 
> >
> 
> > If the next time the script runs and the mac has changed, how can I
> 
> > catch this and put the new MAC into mac_change?
> 
> >
> 
> > Ultimately, I would also like to roll the values too. For example if
> 
> > the MAC changes again, put mac_change into mac_current...
> 
> >
> 
> >
> 
> > Thanks.
> 
> > --=20
> 
> >
> 
> > Paul Halliday
> 
> >
> 
> > "Diplomacy is the art of saying "Nice doggie!" till you can find a rock."
> 
> >
> 
> 
> 
> --
> For technical support contracts, goto https://order.mysql.com/?ref=ensita
> This email is sponsored by Ensita.NET http://www.ensita.net/
>__  ___ ___   __
>   /  |/  /_ __/ __/ __ \/ /Gleb Paharenko
>  / /|_/ / // /\ \/ /_/ / /__   [EMAIL PROTECTED]
> /_/  /_/\_, /___/\___\_\___/   MySQL AB / Ensita.NET
><___/   www.mysql.com
> 
> 
> 
> 
> --
> MySQL General Mailing List
> For list archives: http://lists.mysql.com/mysql
> To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
> 
> 


-- 
_
Paul Halliday
http://dp.penix.org

"Diplomacy is the art of saying "Nice doggie!" till you can find a rock."

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Conditional insert or update.

2005-07-20 Thread Paul Halliday
Hi, 

I have a table that looks something like this:

ip | mac_current | mac_change

now if I have an entry that looks like this

10.0.0.1 | 11:11:11:11:11:11:11 | NULL

If the next time the script runs and the mac has changed, how can I
catch this and put the new MAC into mac_change?

Ultimately, I would also like to roll the values too. For example if
the MAC changes again, put mac_change into mac_current...


Thanks.
-- 

Paul Halliday

"Diplomacy is the art of saying "Nice doggie!" till you can find a rock."

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Solution to slow queries (Resolved, kinda)

2005-05-12 Thread Paul Halliday
Hi,

First of all, thanks to everyone that provided pointers on this matter.
The route I chose to take was to make 2 tables. One is for cumulative
network stats; this table can be used for the weekly,monthly,yearly
reports. I also created a table for daily stats which will be dropped
at midnight each day.

So I whipped up a simple shell script that looks like this:

mysql -B --user=flow --password=hi flowdb -e "DROP TABLE traffic"
mysql -B --user=flow --password=hi flowdb < /home/flow/Code/create_flowdb.sql

and a crontab entry that looks like:

0   0   *   *   *   /home/flow/Code/db_rollover.sh
> /dev/null 2>&1

the report entries look like this:

*/5 *   *   *   *  
/home/flow/Reports/incident_report.tcl > /dev/null 2>&1

*/20*   *   *   *  
/home/flow/Reports/traffic_report.tcl > /dev/null 2>&1

Now looking at crons log from last night I see:

May 12 00:00:00 watcher cron[84039]: (flow) CMD
(/home/flow/Code/db_rollover.sh > /dev/null 2>&1)
May 12 00:00:00 watcher cron[84040]: (flow) CMD
(/home/flow/Reports/traffic_report.tcl > /dev/null 2>&1)
May 12 00:00:00 watcher cron[84041]: (flow) CMD
(/home/flow/Reports/incident_report.tcl > /dev/null 2>&1)

So the script did indeed run, yet it did not drop the table. The
script works fine from the command line so I guess because the other
programs were running too it could not drop the table? Those scripts
are just doing selects, no updates, but there is the possibility that
the program which populates the db was running at the same time too.

Is there a way to force the table drop? (Without adding checks to the
shell script)


Thanks.

_
Paul Halliday
http://dp.penix.org

"Diplomacy is the art of saying "Nice doggie!" till you can find a rock."

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Re: Solution to slow queries

2005-05-10 Thread Paul Halliday
On 5/10/05, Roger Baklund <[EMAIL PROTECTED]> wrote:
> Paul Halliday wrote:
> >   srcaddr   VARCHAR(15),
> >   dstaddr   VARCHAR(15),
> 
> Are these ip-adresses? If they are, consider using UNSIGNED INT columns
> and the INET_NTOA() and INET_ATON() funtions. It will save you a lot of
> space, thus increase the amount of data your hw can handle.

They are indeed ip addresses. This infomation is gathered and input
into the db via a program called flow-export (export netflows). I
intially had the column as UNSIGNED INT but it would only pick up the
first octet, so I switched to VARCHAR.


> 
> > I have read up a bit on merge tables (this is probably the answer) but
> > I am unsure as to how you trigger the changes. ie, how do you do the
> > rollover after every 24hours?
> 
> You would have to program this yourself, there are no mechanisms for
> this in MySQL. It's pretty straight forward, though. Just use a shell
> script and cron or the equivalent if you are on a non-unix platform.
> 
> --
> Roger
> 
> 
> --
> MySQL General Mailing List
> For list archives: http://lists.mysql.com/mysql
> To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]
> 
> 


-- 
_
Paul Halliday
http://dp.penix.org

"Diplomacy is the art of saying "Nice doggie!" till you can find a rock."

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]



Solution to slow queries

2005-05-10 Thread Paul Halliday
Hello,

I am working on a database that deals with network statistics. I have
a program that generates web reports based on this data every ten
minutes.

The table layout looks something like this:

CREATE TABLE traffic
(
  unix_secs INT UNSIGNED NOT NULL,
  dpkts INT UNSIGNED NOT NULL DEFAULT 0,
  doctets   INT UNSIGNED NOT NULL DEFAULT 0,
  first INT UNSIGNED,
  last  INT UNSIGNED,
  srcaddr   VARCHAR(15),
  dstaddr   VARCHAR(15),
  srcport   SMALLINT UNSIGNED,
  dstport   SMALLINT UNSIGNED,
  prot  TINYINT UNSIGNED NOT NULL DEFAULT 0,
  tos   TINYINT UNSIGNED NOT NULL DEFAULT 0,
  tcp_flags TINYINT UNSIGNED NOT NULL DEFAULT 0,
  INDEX unix_secs (unix_secs),
  INDEX srcaddr (srcaddr),
  INDEX dstaddr (dstaddr),
  INDEX srcport (srcport),
  INDEX dstport (dstport)
);

Now, as time progresses the queires are getting slower and slower.
I know this is expected, so I am curious as to how I can have a main
table that has all traffic, so that I can do monthly/yearly reports,
and  also have a daily table so that I can quickly do reports every
minute or so on that data.

I have read up a bit on merge tables (this is probably the answer) but
I am unsure as to how you trigger the changes. ie, how do you do the
rollover after every 24hours?

Any thoughts, or a pointer in the right direction would be greatly appreciated. 


Thanks.
 
-- 
_
Paul Halliday
http://dp.penix.org

"Diplomacy is the art of saying "Nice doggie!" till you can find a rock."

--
MySQL General Mailing List
For list archives: http://lists.mysql.com/mysql
To unsubscribe:http://lists.mysql.com/[EMAIL PROTECTED]