[twitter-dev] Re: Read and store Twitter responses

2009-04-21 Thread Nick Arnett
On Mon, Apr 20, 2009 at 5:59 PM, Joseph northwest...@gmail.com wrote:


 A batch insert may not be the best thing to do in the case of statuses.
 Optimization implies that you have two tables (minimum), one for the
 user info and one for the tweets. Doing a batch insert means that
 you're skipping the step of checking whether the user is already in
 the database, so for every tweet you will add the same user again.
 That will slow you down much more than the batch advantage gains you,
 and will create confusion (unless you store everything in one table, and
 that's even more burdensome).


There are a couple of ways to deal with this. Given sufficient memory, keep
a hash of userIDs in memory and only insert the new ones.  If memory
consumption is a problem, and assuming that userID is the primary key in the
user table, do an INSERT IGNORE for all of the users.  With userID indexed,
that will be quite fast.
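
A minimal sketch of both approaches, in Python rather than PHP. The table
name users, its columns, and the dictionary keys are assumptions made for
illustration only; the %s placeholders are the style the common Python MySQL
drivers accept. The second function only builds the statement and its
parameters, so it can be handed to whatever driver is in use.

```python
def new_users(users, seen_ids):
    """Return only the users not seen before, recording their ids in seen_ids."""
    fresh = []
    for user in users:
        if user["id"] not in seen_ids:
            seen_ids.add(user["id"])
            fresh.append(user)
    return fresh


def insert_ignore_users_sql(users):
    """Build one MySQL INSERT IGNORE statement and its flat parameter list.

    Assumes a hypothetical table users(user_id PRIMARY KEY, screen_name).
    """
    if not users:
        raise ValueError("need at least one user")
    placeholders = ", ".join(["(%s, %s)"] * len(users))
    sql = "INSERT IGNORE INTO users (user_id, screen_name) VALUES " + placeholders
    params = [value for u in users for value in (u["id"], u["screen_name"])]
    return sql, params
```

With the in-memory set, duplicates never reach the database; with INSERT
IGNORE, MySQL skips rows whose primary key already exists.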

It won't be that simple if you have foreign key constraints, but I can't
imagine referential integrity is critical for this sort of application.

My system is far more constrained by things other than the insert speeds.

Nick


[twitter-dev] Re: Read and store Twitter responses

2009-04-20 Thread CWitt

Is there anywhere I could take a look at some of this code to store
the Twitter data in a MySQL database?

On Apr 19, 8:50 pm, Nick Arnett nick.arn...@gmail.com wrote:
 On Sun, Apr 19, 2009 at 2:45 PM, CWitt wittma...@gmail.com wrote:

  My skills are rather limited, but I was thinking PHP and MySQL. I was
  thinking about hiring it out, but putting together the process flow to
  help the programmer and also help me find the correct programmer.

 PHP and MySQL sound appropriate to what you're hoping to do.  Storing
 Twitter data in MySQL is generally not a big deal, since there is such
 limited data.  A lot of us have probably created similar schemas for that
 purpose.  The rest of your code sounds slightly more complex, especially if
 you're trying to do some sort of natural language parsing, which is always
 hard.  I don't know if there are libraries in PHP for that purpose.  There
 are in other languages.

 In any case, without specifics, it would be hard for anyone to guide you.

 Nick


[twitter-dev] Re: Read and store Twitter responses

2009-04-20 Thread Andrew Badera
This isn't a SQL tutorial nor a MySQL list. Some might suggest you'd be
better off learning the basics of what you're trying to do -- learning how
to walk before you can run and all that.

Thanks-
- Andy Badera
- and...@badera.us
- Google me: http://www.google.com/search?q=andrew+badera






[twitter-dev] Re: Read and store Twitter responses

2009-04-20 Thread Doug Williams
I've broken the task into logical steps to get you started. I'd suggest
searching Google and the wiki [1] for the libraries and implementation
details for each:

1. Download a timeline or a set of statuses.
2. Iterate through that set of statuses, pulling out each individual status.
3. For each status in the set, perform an SQL insert to save the status.

A great way to learn is to try and find sample code that gets you each of
these steps separately, then put them together. There is plenty of PHP and
MySQL sample code available online or in books to get you started.
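
The three steps above can be sketched roughly as follows, in Python rather
than PHP. This assumes the timeline has already been downloaded as JSON text
(step 1) and that cursor is a DB-API cursor from a MySQL driver. The table
statuses and its columns are made up for the example; id, text, created_at
and user are fields of a status as the API returns it.

```python
import json

# Hypothetical table layout; adjust to your own schema.
INSERT_STATUS = (
    "INSERT INTO statuses (status_id, user_id, created_at, text) "
    "VALUES (%s, %s, %s, %s)"
)


def save_statuses(cursor, json_text):
    """Insert every status in a downloaded JSON timeline, one row at a time."""
    statuses = json.loads(json_text)          # step 1's result: a JSON array
    for status in statuses:                   # step 2: pull out each status
        row = (
            status["id"],
            status["user"]["id"],
            status["created_at"],
            status["text"],
        )
        cursor.execute(INSERT_STATUS, row)    # step 3: one INSERT per status
    return len(statuses)
```

Passing the values as parameters, instead of pasting them into the SQL
string, leaves quoting and escaping to the driver.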

1. http://apiwiki.twitter.com/Libraries#PHP

Thanks,
Doug Williams
Twitter API Support
http://twitter.com/dougw







[twitter-dev] Re: Read and store Twitter responses

2009-04-20 Thread Nick Arnett
On Mon, Apr 20, 2009 at 3:16 PM, Doug Williams d...@twitter.com wrote:

3. For each status in the set, perform an SQL insert to save the status.


Or, I would hope, create an array of inserts and do a multi-insert, which
will be far faster than iterating through a list.

http://www.desilva.biz/mysql/insert.html

I'll bet you knew that, but I just had to note it because the performance
difference is enormous.
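
As a rough illustration of the multi-insert idea, here is a sketch in Python
that turns a list of rows into a single INSERT with many VALUES groups. The
table statuses and its three columns are assumptions for the example, and
the batching helper is just one way to keep each statement to a manageable
size.

```python
def chunked(rows, size):
    """Yield successive slices of rows, each at most size long."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def multi_insert_sql(rows):
    """Build one multi-row INSERT and its flat parameter list.

    Each row is a (status_id, user_id, text) tuple for a hypothetical
    statuses table.
    """
    if not rows:
        raise ValueError("need at least one row")
    groups = ", ".join(["(%s, %s, %s)"] * len(rows))
    sql = "INSERT INTO statuses (status_id, user_id, text) VALUES " + groups
    params = [value for row in rows for value in row]
    return sql, params
```

Each batch then costs one round trip to the server instead of one per
status: for batch in chunked(rows, 500), build the statement and execute it
once.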

Nick
(not really a PHP guy, but years of (often painfully gained) MySQL
performance knowledge)


[twitter-dev] Re: Read and store Twitter responses

2009-04-20 Thread Doug Williams
Nick,
Batch INSERTs are great for people looking for performance tweaks. Serial
INSERT statements within the iteration loop keep things simple for those
just starting out.

Doug Williams
Twitter API Support
http://twitter.com/dougw





[twitter-dev] Re: Read and store Twitter responses

2009-04-20 Thread Nick Arnett
On Mon, Apr 20, 2009 at 5:41 PM, Doug Williams d...@twitter.com wrote:

 Nick,
 Batch INSERTs are great for people looking for performance tweaks.
 Serial INSERT statements within the iteration loop keep things simple for
 those just starting out.


True, of course... and now that I think about it, multi-byte characters in
the midst of a failing multi-insert can be hard to figure out if you don't
know what you're doing.

Speaking of which, for anybody who is getting started in this sort of thing:
setting the default character set in MySQL to UTF-8 (before creating
tables!) will help avoid a lot of confusion and headaches. They drove me
slightly nuts, and I'm far from a newbie.
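
To make the character-set advice concrete, here is a small sketch in Python.
The database name twitter_archive is made up for the example; the two
statements are ordinary MySQL statements held as strings for whatever driver
is in use. The helper only illustrates why a 140-character tweet can need
more than 140 bytes of storage.

```python
# MySQL statements to run once, before any tables are created.
# "twitter_archive" is a hypothetical database name.
CHARSET_SETUP = [
    "CREATE DATABASE twitter_archive DEFAULT CHARACTER SET utf8",
    # Tell the server that this connection sends and expects UTF-8.
    "SET NAMES utf8",
]


def utf8_byte_length(text):
    """Number of bytes the text occupies once encoded as UTF-8."""
    return len(text.encode("utf-8"))
```

Plain ASCII takes one byte per character, accented Latin letters two, and
most CJK characters three. Newer MySQL versions also offer utf8mb4 for
four-byte characters.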

Nick


[twitter-dev] Re: Read and store Twitter responses

2009-04-20 Thread Joseph

A batch insert may not be the best thing to do in the case of statuses.
Optimization implies that you have two tables (minimum), one for the
user info and one for the tweets. Doing a batch insert means that
you're skipping the step of checking whether the user is already in
the database, so for every tweet you will add the same user again.
That will slow you down much more than the batch advantage gains you,
and will create confusion (unless you store everything in one table, and
that's even more burdensome).

Now, does anyone know if there's some obscure version of UPDATE that
takes parameters to allow me to use UPDATE instead of INSERT (saving
me from the extra step of checking whether the person is already in my
database)? I'm fairly new to MySQL.



[twitter-dev] Re: Read and store Twitter responses

2009-04-20 Thread Abraham Williams
http://dev.mysql.com/doc/refman/5.0/en/insert-on-duplicate.html
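
The page linked above documents INSERT ... ON DUPLICATE KEY UPDATE, which
inserts a row or, if its key already exists, updates it in the same
statement. A minimal sketch in Python follows; the users table, its columns,
and the dictionary keys are assumptions for illustration, and user_id must be
a primary or unique key for the statement to behave this way.

```python
# Hypothetical users table with user_id as its primary key.
UPSERT_USER = (
    "INSERT INTO users (user_id, screen_name, followers_count) "
    "VALUES (%s, %s, %s) "
    "ON DUPLICATE KEY UPDATE "
    "screen_name = VALUES(screen_name), "
    "followers_count = VALUES(followers_count)"
)


def upsert_user(cursor, user):
    """Insert the user, or refresh the stored row if user_id already exists."""
    cursor.execute(
        UPSERT_USER,
        (user["id"], user["screen_name"], user["followers_count"]),
    )
```

This removes the separate "is this person already in my database?" query:
the check and the write happen in a single statement.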





-- 
Abraham Williams | http://the.hackerconundrum.com
Hacker | http://abrah.am | http://twitter.com/abraham
Web608 | Community Evangelist | http://web608.org
This email is: [ ] blogable [x] ask first [ ] private.
Sent from Madison, Wisconsin, United States


[twitter-dev] Re: Read and store Twitter responses

2009-04-19 Thread CWitt

My skills are rather limited, but I was thinking PHP and MySQL. I was
thinking about hiring it out, but putting together the process flow to
help the programmer and also help me find the correct programmer.


On Apr 16, 10:52 am, Nick Arnett nick.arn...@gmail.com wrote:
 On Wed, Apr 15, 2009 at 6:30 PM, CWitt wittma...@gmail.com wrote:

  I've looked through the discussion. What I do understand is that it is
  acceptable to store Twitter search results in my own database. What I
  am wondering is how to extract this information and actually store it
  in my database.

 Broad question... what language(s) do you code in?  What databases are you
 familiar with?  What is the web platform you are using?

 Nick