(Message forwarded see below for original response)

Hi John,

please excuse sending this mail in private. The mail account I'm using
for the lists is currently blocked for sending, but that has to wait
till next year. ;)

See my explanation in the following and feel free to forward this mail
to the u...@dba mailing list.

Okay, so now my desk is clear and I wish you and anyone reading this a
happy new year,
Marc


Am Montag, den 29.12.2008, 18:10 -0500 schrieb John Toliver:
> Marc I have been staring at this code you wrote for the past two days and
> I'm not able to make sense of it.  I need a little more help.  Now I
> understand that the two lines are dependent on how the data in my fields
is
> structured.

Yes.

> It's simple text, like any normally formatted paragraph.  Is
> their something I can do to the DB to analyze it to figure out?  With that
> question asked see below:

The question making the difference between the two statements is:

Does your data have multiple paragraphs in one entry or is it only one
paragraph (maybe formed by multiple sentences) in any case?

If you have line endings inbetween the pragraph that are welcome, the
second statement I wrote would be necessary. If you have only one
paragraph for sure and want to get rid of any additional paragraph at
the end, the first statement should do.

This makes up the first part of the problem statement.

> > If Johns data does not have sentences or parts separated by NL or CR
> > chars he could use sth. like:
> >
> > LEFT(TheString, POSITION ('.'+CHAR(10) IN TheString)+1)
>
>
> JT - I don't think the above line would work then because I have many with
> paragraphs separated by either a line feed (is that char10?) or a carriage
> return (is this char13).  I'm not sure how to tell which is which as I
> haven't manually written all the data in each field.  Some of it has been
> pasted from the web or other sources.  Hence my question above about
> analyzing the data.

I'm not sure either what type of line endings HSQL uses and if using
another OS makes a difference. E.g. if you work on windows storing CR+NL
and want to retrieve on Lunix expecting NL only there is a problem. The
real question here is how OOo handles it, because HSQL stores what is is
given, I assume.

- Windows/DOS has both, NL and CR for ending a parapgraph.

- Lunix/FreeBSD have only NL.

- McIntosh had only CR in the past, this may have changed because
nowadays on OS X it uses a FreeBSD userland (=programs under the hood).

So part two of the solution has to be to find out what type of paragraph
endings are stored in your database.

> > If there are concatenated sentences this would only work for double
> > newlines, leaving on extra NL:
> >
> > LEFT(TheString, POSITION ('.'+CHAR(10)+CHAR(10) IN TheString)+1)
>
>
> I don't think I have concatenated sentences.  The text would read as
normal
> information.
>
>
> > > Can the REPLACE function take a concatenated string like
> > > CHAR(13)+CHAR(10), for example? Trimming the individual characters
> > > wouldn't be enough if they're interspersed with other "whitespace"
> > > characters.
> >
>
> JT - I went to the site suggested by Barbara, (
> http://hsqldb.org/doc/guide/ch09.html) and found this:
> "REPLACE(s,replace,s2)
>         replaces all occurrences of replace in s with s2"  Which I think
> means if I were to open an SQL window for this, it would replace all
> instances of the phrase "replace" in the table named "s" with the new
phrase
> "s2"?  If so then can I use this on an entire table?  I'm not clear on the
> implementation of the command, incidentally the same question I have about
> the code you originally mentioned.  Would I place the variation of this
text
> in the original RTRIM string we looked at?

>From the data you posted as an example (wich I found too late ;) I
expect for the paragraph structure it has to be the first statement I
wrote.

Your data example again:

[Contents of a field in the record "Major/Common Side Effects" before
and
after running the command we've discussed]

"CNS: confusion, sedation, dysphoria, euphoria, floating feeling,
hallucinations, headache, unusual dreams,
EENT: blurred vision, diplopia, miosis,
Resp: respiratory depression,
CV: hypotension, bradycardia,
GI: constipation, nausea, vomiting,
GU: urinary retention,
Derm: sweating,
Misc: physical dependence, psychological dependence, tolerance,
CR
CR
CR
CR
..."

I see no paragraph endings inbetween. As a start this would be:

 LEFT(TheString, POSITION ('.'+CHAR(10) IN TheString)-1)

What does it do? (-> look at the HSQL docs, too!)

 LEFT(string, position)

gives back the left side part of the string up to the given numerical
position.

 LEFT("John", 2)

returns "Jo".

Now what is missing is the position where to stop getting the left part
of your data. You want anything up to the first CR or NL. I assume
including one CR or NL here.

 POSITION( searchstring IN datastring )

returns the starting point of the search string inside the data string.

 POSITION("o", "John")

returns 2, because the o is the second char inside the data string.

Okay, adapting the idea to your sample data would give:

 LEFT(JohnsString, POSITION (CHAR(13) IN JohnsString)+1)

The position is calculated as the end of your data plus one CR. If you
do not want the CR leave out the +1 and the string you get is one char
shorter.

In all this you may have to replace CR by NL.

Phew,
Marc

--
Marc Santhoff <[email protected]>




-- 
I've discovered the key to success is to never give up.  You either learn
the right way, or you run out of ways to do it wrong.  A win/win situation!

Reply via email to