Background: I am building an application that is going to replace a
client-server document publication app.  The old app was custom
written and used MS Word as the editor to save and store content for
the documents that needed to be published.  It was poorly implemented,
but it worked well enough.

In rebuilding the app, I'm importing the content from those MS Word
docs using a CFX tag that converts each Word doc into straight text,
removing all formatting and other junk but leaving all text and
whitespace characters.  I am using ColdFusion 5 on a dev server
running Win2K Server, for a couple of reasons: first, the original
database only has basic ODBC drivers and I couldn't get it to work
with MX; second, there is a huge performance hit when using COM with
CFMX (importing 14k docs took 8+ hours on MX and less than 90 minutes
with CF5).  Many of these docs have tabs in them for formatting basic
columnar data.  The tab does not, of course, have any real meaning in
the web world (and browsers collapse whitespace), so the result is
that all of the columnar data is out of whack.

My first thought is to use rereplace to find tab chars and replace
them with NBSPs.  I am doing the following:

<cfset tmp.content = rereplace(tmp.content, "[\t]+",
"&nbsp;&nbsp;&nbsp;&nbsp;", "ALL")>

Which I thought would replace tabs, but it's instead replacing all of
the "t" characters.  What's wrong with my regex code?
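
One possible explanation, offered as an assumption rather than a
confirmed diagnosis: the CF5 regex engine may not recognize the \t
escape and may read it as a literal "t".  A minimal sketch of a
workaround under that assumption builds the tab character with Chr(9)
instead (the variable name "tab" is only illustrative):

```cfm
<!--- Assumption: CF5 treats "\t" in a regex as a plain "t",
      so the tab character is built with Chr(9) instead. --->
<cfset tab = Chr(9)>

<!--- Replace each run of one or more tabs with four
      non-breaking spaces. --->
<cfset tmp.content = REReplace(tmp.content, "#tab#+",
    "&nbsp;&nbsp;&nbsp;&nbsp;", "ALL")>
```

Because the pattern no longer relies on a backslash escape, the same
call should behave the same way on CF5 and on CFMX.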

Next, I would actually prefer to somehow parse out these docs and
instead of using non-breaking spaces, I'd like to wrap that columnar
data in a table.  Has anyone done this before, and if so, what was
your approach?  String parsing isn't my strong suit.
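
For illustration, here is a rough sketch of one way the table idea
could work.  It assumes that any line containing a tab is a row of
columnar data and that each run of tabs separates two cells; neither
assumption has been checked against the real docs.  The names "tab",
"out", "inTable" and "line" are hypothetical.

```cfm
<cfset tab = Chr(9)>
<cfset out = "">
<cfset inTable = 0>

<!--- Walk the text one line at a time; CR and LF both act as
      list delimiters, so empty lines are skipped. --->
<cfloop list="#tmp.content#" index="line"
        delimiters="#Chr(13)##Chr(10)#">
    <!--- Escape HTML-special characters before adding markup. --->
    <cfset line = HTMLEditFormat(line)>
    <cfif Find(tab, line)>
        <!--- Tabbed line: open a table if one is not open yet,
              then emit the line as a row. --->
        <cfif NOT inTable>
            <cfset out = out & "<table>">
            <cfset inTable = 1>
        </cfif>
        <cfset out = out & "<tr><td>"
            & REReplace(line, "#tab#+", "</td><td>", "ALL")
            & "</td></tr>">
    <cfelse>
        <!--- Plain line: close any open table, then emit the text. --->
        <cfif inTable>
            <cfset out = out & "</table>">
            <cfset inTable = 0>
        </cfif>
        <cfset out = out & line & "<br>">
    </cfif>
</cfloop>

<!--- Close a table that runs to the end of the document. --->
<cfif inTable>
    <cfset out = out & "</table>">
</cfif>
```

Two limitations of this sketch: blank lines are dropped, because CF
list loops ignore empty elements, and rows with different numbers of
tab runs end up with different numbers of cells.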

If necessary, I COULD do a basic Word scrape in CF5 but then do a more
advanced content manipulation using CFMX7.

ANY help would be appreciated.  I'd appreciate a free solution, but if
a good pay solution exists, let me know.

Thanks,

Pete
