Ah, at first I'm just going to have one big table, with columns for
date, subject (normalized to remove strings like RE: and [flexcoders],
I think), and sender (name of sender, not email address). On the first
pass I'm only going to get the date, not the full timestamp of each
message, but I might make a second pass and get the full timestamp
(either a new column or filling out the same date column) and maybe
even the full message body (although that will increase the size of
the DB substantially).
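
A rough sketch of what that one table could look like, using Python's
built-in sqlite3 module just for illustration. The table name `messages`,
the column names, and the `normalize_subject` helper are assumptions made
up for this example, not the layout of the actual DB file, and the prefix
pattern only covers the RE:/FW:/FWD: markers and the [flexcoders] tag.

```python
import re
import sqlite3

# Hypothetical pattern: one leading "RE:", "FW:", "FWD:" marker or "[flexcoders]" tag.
_PREFIX = re.compile(r"^\s*(?:(?:re|fwd?)\s*:|\[flexcoders\])\s*", re.IGNORECASE)


def normalize_subject(subject: str) -> str:
    """Strip reply/forward markers and the list tag from the front, repeatedly."""
    previous = None
    while previous != subject:
        previous = subject
        subject = _PREFIX.sub("", subject)
    return subject.strip()


def create_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the database and make sure the single table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS messages (
            date    TEXT NOT NULL,  -- date only on the first pass (YYYY-MM-DD)
            subject TEXT NOT NULL,  -- normalized subject
            sender  TEXT NOT NULL   -- display name, not email address
        )
        """
    )
    return conn


def add_message(conn: sqlite3.Connection, date: str, subject: str, sender: str) -> None:
    """Insert one message row, normalizing the subject on the way in."""
    conn.execute(
        "INSERT INTO messages (date, subject, sender) VALUES (?, ?, ?)",
        (date, normalize_subject(subject), sender),
    )
```

If the second pass happens, the full timestamp could either overwrite the
`date` values or go into a new column, and the message body would be one
more TEXT column.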

Doug

On Mon, Jun 23, 2008 at 4:05 AM, arieljake <[EMAIL PROTECTED]> wrote:
> Sorry, I meant what would the tables look like (model)?
>
> --- In [email protected], "Doug McCune" <[EMAIL PROTECTED]> wrote:
>>
>> I'm planning on letting people download the sqlite database file.
>> That's a single file (with the .db extension I think) that you can
>> load into any AIR app and access. I could also do a CSV, but it's a
>> lot of data (hell, Excel can't even load that many rows).
>>
>> Doug
>>
>> On Sun, Jun 22, 2008 at 10:18 PM, arieljake <[EMAIL PROTECTED]> wrote:
>> > What will the data format be so we can plan ahead?
>> >
>> > --- In [email protected], "Doug McCune" <doug@> wrote:
>> >>
>> >> Within the next week I hope to have a fairly complete dataset of
>> >> date, subject, and name of who posted for close to all the messages
>> >> ever posted to flexcoders. I might also do a second pass to get full
>> >> text of each message too. I say "close to all" because I'm scraping
>> >> the mail archive website and that only shows 95,000 messages, but in
>> >> reality I think there are more like 116,000. Plus some of the messages
>> >> from 2004 seem to have gotten a bit jacked on the mail archive site
>> >> and don't show up with the proper subjects and senders (there's a
>> >> block of a few hundred).
>> >>
>> >> But once I get that I'm going to post a sqlite DB file that has it all
>> >> that you can load into an air app to play with. I'll let y'all know
>> >> when you can start playing with the data.
>> >>
>> >> Doug
>> >>
>> >> On Sun, Jun 22, 2008 at 5:29 PM, Tim Hoff <TimHoff@> wrote:
>> >> >
>> >> > flexCodersStatus = (mostRecentlyPostedThread ==
>> >> >     "The subject that shall not be named!" ? "Banned" : "Active");
>> >> >
>> >> > -TH :-)
>> >> >
>> >> > --- In [email protected], Matt Chotin <mchotin@> wrote:
>> >> >>
>> >> >> Hey folks,
>> >> >>
>> >> >> Given the recent big thread on the subject that shall not be named, my
>> >> >> colleague Suchit went ahead and got some interesting membership stats
>> >> >> since the beginning of the list.
>> >> >>
>> >> >> Number of people who:
>> >> >>
>> >> >> Joined group == 14036
>> >> >> Left group == 3686
>> >> >> Were Banned == 631
>> >> >> Got Removed == 359
>> >> >>
>> >> >> I would say losing ~25% over time isn't too bad. That said, we haven't
>> >> >> done analysis to see if there's a trend in dates or if we're losing more
>> >> >> people recently, etc. But interesting info for starters...
>> >> >>
>> >> >> Matt
>> >> >>
>> >> >
>> >> >
>> >>
>> >
>> >
>>
>
> 
