I should probably give that a try just to see, but I am 99.999% positive that 
Fedora has tabs in their changelog fields. I don't think a single character 
delimiter will work across the board given the variety of the source data.

' | ' was working just fine... but apparently the PostgreSQL COPY command only 
accepts a single one-byte character as its delimiter.
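
One possible workaround, sketched here in Python: split each line on the 
multi-character ' | ' separator and re-encode it as quoted CSV, which COPY can 
read with WITH (FORMAT csv). The function name to_csv and the sample row are 
made up for illustration, and the sketch assumes ' | ' (with the surrounding 
spaces) never occurs inside a field.

```python
import csv
import io


def to_csv(lines, sep=" | "):
    """Re-encode sep-delimited lines as quoted CSV text.

    Assumption: sep never appears inside a field. Bare '|' and tab
    characters inside fields are fine, since neither is the CSV delimiter.
    """
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for line in lines:
        # Split on the multi-character separator, then let the csv
        # module quote any field containing commas, quotes or newlines.
        writer.writerow(line.rstrip("\n").split(sep))
    return out.getvalue()


# Hypothetical sample row: an address using |at| and a field with a tab.
sample = ['pkg | "Jane Doe" <jane|at|example.org> | fixed\ttabs']
print(to_csv(sample), end="")
```

The csv module doubles embedded double quotes, which matches the default 
quoting COPY expects in CSV mode. One thing to check before relying on this: 
in CSV mode COPY treats an unquoted empty field as NULL by default.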

Thought I'd toss this out since I just started pulling in all 18 million 
updates (again) and now I'm vaguely bored. The previous attempt using manual 
INSERTs for each line took 2.5 hours, but I wasn't merging the new data with 
the old. 
 
-Ben


On Monday, September 9th, 2024 at 4:30 PM, Russell Senior 
<[email protected]> wrote:

> I like tabs as delimiters, fwiw.
> 
> --
> Russell Senior
> [email protected]
> 
> On Mon, Sep 9, 2024 at 4:22 PM Ben Koenig [email protected] wrote:
> 
> > I might be preaching to the choir here, but it turns out there are lots of 
> > ways to write down an email address.
> > 
> > I'm reading data into a database that is stored in "|" delimited strings - 
> > "col1|col2|col3|" etc. This data includes email addresses. Much to my 
> > surprise I ran into errors because someone (who shall remain nameless) 
> > decided to enter their identity under the following format:
> > 
> > "firstname lastname" <$USER|at|$HOST>
> > 
> > VERTICAL LINES. Why they did this, I don't know. As it turns out, this 
> > particular data set has no defined policy for users entering their email, 
> > since I see the following formats...
> > 
> > "firstname lastname" <$USER|at|$HOST>
> > "firstname lastname" <$USER at $HOST>
> > 
> > "firstname lastname" <$USER@$HOST>
> > 
> > "firstname lastname"
> > 
> > I'm sure there are other formats I have yet to observe.......
> > 
> > I've seen the human readable "at" in email addresses, but I was NOT 
> > expecting to see someone combine it with vertical lines. But then again, I 
> > suppose it would be unreasonable of me to expect volunteers working for 
> > free to adhere to a strict policy for data entry.
> > 
> > -Ben
