> How would you go about formatting text pasted from a word document into
> a coldfusion web form?

This is what I do. It needs a couple of jar files to work to the best of its
capabilities.

1. Install the jTidy jars and register cfx_markdown as a Java CFX tag.
2. Check to see if the posted data is HTML or not.
3. If the posted data is *not* HTML, use some RegEx and cfx_markdown to
create some HTML from the posted data.
4. Clean out the junk HTML markup that a Word-formatted submission would
include.
5. Parse the HTML with jTidy to make the HTML valid.
6. Run some funky functions to create ordered and unordered lists where
appropriate.
7. Re-parse the HTML with jTidy to ensure that the HTML is still
valid.
8. Fix the appearance of ampersands throughout the text.
9. Return valid, cleanly formatted HTML.
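
To give a rough idea of steps 2, 4 and 8, here is a minimal Java sketch using
only java.util.regex. It is not my production code: the class name, method
names and patterns are made up for illustration, the patterns only cover the
most common Word residue (comments, <o:p>-style namespaced tags, and
class/style/lang attributes), and the jTidy and cfx_markdown steps are left
out entirely.

```java
import java.util.regex.Pattern;

// Hypothetical helper: names and patterns are illustrative assumptions.
final class WordPasteSketch {

    // Step 2: crude "is this HTML?" test, true if any tag-like token is present.
    private static final Pattern TAG = Pattern.compile("</?[a-zA-Z][^>]*>");

    // Step 4: typical Word residue.
    // HTML comments, including Word's conditional comments.
    private static final Pattern COMMENT = Pattern.compile("(?s)<!--.*?-->");
    // Namespaced tags such as <o:p> and </o:p>.
    private static final Pattern NS_TAG = Pattern.compile("</?[a-zA-Z]+:[^>]*>");
    // class, style and lang attributes (e.g. class="MsoNormal").
    private static final Pattern WORD_ATTR = Pattern.compile(
        "(?i)\\s+(?:class|style|lang)\\s*=\\s*(?:\"[^\"]*\"|'[^']*'|[^\\s>]+)");

    // Step 8: an ampersand that does not already start an entity.
    private static final Pattern BARE_AMP = Pattern.compile(
        "&(?!#\\d+;|#[xX][0-9a-fA-F]+;|[a-zA-Z][a-zA-Z0-9]*;)");

    static boolean looksLikeHtml(String s) {
        return TAG.matcher(s).find();
    }

    static String stripWordJunk(String html) {
        String out = COMMENT.matcher(html).replaceAll("");
        out = NS_TAG.matcher(out).replaceAll("");
        return WORD_ATTR.matcher(out).replaceAll("");
    }

    static String fixAmpersands(String html) {
        return BARE_AMP.matcher(html).replaceAll("&amp;");
    }
}
```

You would call stripWordJunk() before handing the markup to jTidy and
fixAmpersands() at the very end. Regexes like these are blunt instruments
(the attribute pattern will also hit text that merely looks like an
attribute), which is one reason the jTidy passes are still needed.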

> Also should you format the data before it is inputed into the database
> or would you format the data when it is displayed?

Format it before it goes into the database. Store the formatted markup rather
than the pasted junk, as you then only need to parse it once. If you store it
as posted and re-parse every time you pull it out, you're putting extra load
on the server. If you're worried about the parser getting things wrong, store
the posted and fixed content together; when you introduce improvements to
your Word parsing algorithms, you can re-parse the old content too.
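
One way to picture the "store both" option is below. This is only a sketch:
the StoredPost class, its fields and the version counter are hypothetical,
and in practice these would just be columns in your table.

```java
import java.util.function.UnaryOperator;

// Hypothetical row shape: the raw paste, the cleaned markup, and the
// version of the cleaner that produced it.
final class StoredPost {
    final String posted;       // exactly what the user submitted
    final String cleaned;      // cleaner output, computed once at save time
    final int cleanerVersion;  // which revision of the cleaner was used

    StoredPost(String posted, String cleaned, int cleanerVersion) {
        this.posted = posted;
        this.cleaned = cleaned;
        this.cleanerVersion = cleanerVersion;
    }

    // Parse once on the way in; display code only ever reads 'cleaned'.
    static StoredPost save(String posted, UnaryOperator<String> cleaner, int version) {
        return new StoredPost(posted, cleaner.apply(posted), version);
    }

    // After improving the cleaner, re-run it over rows made by an older version.
    StoredPost reparseIfStale(UnaryOperator<String> cleaner, int currentVersion) {
        if (cleanerVersion >= currentVersion) {
            return this;
        }
        return save(posted, cleaner, currentVersion);
    }
}
```

The display path never touches the parser, and a one-off batch job can call
reparseIfStale() on old rows whenever the cleaner changes.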

I have source code for this, but it's quite a chunk of code and is spread
around a few CFCs. I could probably put all the functions into one file to
give you some inspiration, and you might be able to refactor them for your
purposes, as it's really not an easy thing to get right.

One day I may just make a "Word Cleaner" CFC that does this all in one place
:-).

Paul



Archive: 
http://www.houseoffusion.com/groups/cf-talk/message.cfm/messageid:328009