On Apr 14, 2011, at 5:41 AM, Ralph Shnelvar wrote:

We have twenty-or-so MS Word 2000 documents that we want to display on
our website.

What we did was convert the MS Word documents to Compact HTML. We then
display a document via an
 <object data="/doc/somedoc.htm" height='100%' id='xyz' width='100%'>

This all works great except for a bit of a fly in the ointment.

Doing an SEO (Search Engine Optimization) analysis shows that at least
one analyzer does not analyze the contents of "/doc/somedoc.htm".

I guess it is reasonable not to count the contents of the document
pointed to because it might not even be owned by the displaying page.

- - - -

But these MS Word 2000 documents are _ours_.  So does anyone know of a
way to automatically convert the htm file produced so that I can render
it inline rather than refer to the document via object/data?

If these documents are all alike in internal structure, you could write
a little script using Nokogiri to capture only the id="whatever" node
containing the page content, and then write that back out as a sort of
partial. Or pull it into an ActiveRecord object and persist it in your
database.

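require 'rubygems'
require 'nokogiri'

fp = '/path/to/your/file'
# If your starting document is well-formed:
doc = Nokogiri::XML(File.read(fp))
# Otherwise, use the more forgiving HTML parser:
# doc = Nokogiri::HTML(File.read(fp))

# Grab only the node that wraps the page content.
div = doc.at_css('#someDiv')
abort "No #someDiv node found in #{fp}" if div.nil?
output = div.to_xhtml

# Write the fragment back out as a partial (placeholder path).
File.open('/path/to/your/partial', 'w') { |f| f.write(output) }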

Walter


--
Posted via http://www.ruby-forum.com/.

--
You received this message because you are subscribed to the Google Groups "Ruby on Rails: Talk" group. To post to this group, send email to rubyonrails- [email protected]. To unsubscribe from this group, send email to [email protected] . For more options, visit this group at http://groups.google.com/group/rubyonrails-talk?hl=en .

