Side note: I have found that TRUNCATE followed by COPY is a much faster way to load your data into PostgreSQL than running individual INSERTs for the new data.
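
For what it's worth, here is a rough sketch of one way around the multicharacter delimiter problem: rewrite the ' | ' separated lines as CSV first, then load them with COPY in csv format. The table name `updates` and the sample rows are made up for illustration, and this assumes one record per line with no embedded newlines, so treat it as a starting point rather than a drop-in solution.

```python
import csv
import io

# Hypothetical table name; the thread does not give the real schema.
TABLE = "updates"

# Statements to run in a single transaction (via psql or a database driver).
TRUNCATE_SQL = f"TRUNCATE TABLE {TABLE};"
COPY_SQL = f"COPY {TABLE} FROM STDIN WITH (FORMAT csv);"


def to_csv(lines, sep=" | "):
    """Convert lines split on a multicharacter separator into CSV text.

    The csv module quotes any field that contains a comma or a double
    quote, so the stray '|' characters and tabs in the data pass through
    unchanged.
    """
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    for line in lines:
        writer.writerow(line.rstrip("\n").split(sep))
    return buf.getvalue()


# Made-up sample rows: one with '|at|' in the address, one with a tab.
sample = [
    'pkg-a | 1.0 | "firstname lastname" <user|at|host>\n',
    "pkg-b | 2.0 | changelog with\ta tab\n",
]
csv_text = to_csv(sample)
print(TRUNCATE_SQL)
print(COPY_SQL)
print(csv_text, end="")
```

The resulting text can be fed to COPY ... FROM STDIN, or written to a file and loaded with psql's \copy. Converting to CSV sidesteps the single-character delimiter limit because the quoting, not the delimiter choice, protects the field contents.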

On Mon, Sep 9, 2024 at 4:54 PM Ben Koenig <[email protected]> wrote:

> I should probably give that a try just to see, but I am 9.999% positive
> that fedora has tabs in their changelog fields. I don't think a single
> character delimiter will work across the board given the variety of the
> source data.
>
> ' | ' was working just fine... but apparently the postgres COPY command
> doesn't like multicharacter delimiters.
>
> Thought I'd toss this out since I just started pulling in all 18 million
> updates (again) and now I'm vaguely bored. The previous attempt using
> manual INSERT's for each line took 2.5 hours, but I wasn't merging the new
> data with the old.
>
> -Ben
>
>
> On Monday, September 9th, 2024 at 4:30 PM, Russell Senior <[email protected]> wrote:
>
> > I like tabs as delimiters, fwiw.
> >
> > --
> > Russell Senior
> > [email protected]
> >
> > On Mon, Sep 9, 2024 at 4:22 PM Ben Koenig [email protected] wrote:
> >
> > > I might be preaching to the choir here, but it turns out there are
> > > lots of ways to write down an email address.
> > >
> > > I'm reading data into a database that is stored in "|" delimited
> > > strings - "col1|col2|col3|" etc. This data includes email addresses.
> > > Much to my surprise I ran into errors because someone (who shall
> > > remain nameless) decided to enter their identity under the following
> > > format:
> > >
> > > "firstname lastname" <$USER|at|$HOST>
> > >
> > > VERTICAL LINES. Why they did this, I don't know. As it turns out, this
> > > particular data set has no defined policy for users entering their
> > > email, since I see the following formats...
> > >
> > > "firstname lastname" <$USER|at|$HOST>
> > > "firstname lastname" <$USER at $HOST>
> > >
> > > "firstname lastname" <$USER@$HOST>
> > >
> > > "firstname lastname"
> > >
> > > I'm sure there are other formats I have yet to observe.......
> > >
> > > I've seen the human readable "at" in email addresses, but I was NOT
> > > expecting to see someone combine it with vertical lines. But then
> > > again, I suppose it would be unreasonable of me to expect volunteers
> > > working for free to adhere to a strict policy for data entry.
> > >
> > > -Ben
