I have the following validator, which I am switching from Beautiful Soup
to tidy because of some issues Beautiful Soup had with entities:

# custom validator for regular page content
class validPageContent( v.FancyValidator ):
    tidy_options = dict( output_xhtml=1, add_xml_decl=0, doctype='omit',
                         show_warnings=0, show_body_only=1, tidy_mark=0,
                         char_encoding='utf8' )
    
    # message the validator can send, must be named messages
    messages = {  'bad_html': 'There are some errors in your html.', }
    
    # this does the to_python conversion
    def _to_python( self, value, state ):
        # use tidy to clean up the html, and possibly get errors
        self.tidy_html = tidy.parseString( value, **self.tidy_options )
        return str(self.tidy_html)
   
    # validate_python runs AFTER the _to_python method
    def validate_python( self, value, state ):
        # if tidy reported errors, the html is bad
        if self.tidy_html.errors:
            log.info( "\n\nErrors: %s\n\n" % self.tidy_html.errors )
            raise v.Invalid( self.message('bad_html', state), value,
                  state )


The validator seems to run without problems, but only one character of
my input gets through. This wasn't happening with Beautiful Soup, so I
am at a loss as to what could be going on. The validator runs after a
formencode UnicodeString validator, with the two combined in a
validators.All(). If anyone can shed some light on this, it would be
most appreciated.
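
One possible cause worth ruling out, offered as an assumption rather than
a diagnosis: the UnicodeString validator hands on a unicode object, while
tidy wraps a C library that reads NUL-terminated byte strings. If a
wide-character buffer were read that way, the read would stop after the
first character. The sketch below only illustrates that effect. It uses
Python 3 syntax, the helper name c_string_view is hypothetical, and
UTF-16-LE is used as a stand-in for a wide-character buffer; it does not
reproduce what the tidy wrapper actually does internally.

```python
def c_string_view(buf: bytes) -> bytes:
    """Return the bytes a C function would see when reading buf
    as a NUL-terminated string (hypothetical helper for illustration)."""
    return buf.split(b"\x00", 1)[0]


html = "<p>caf\u00e9</p>"

# Stand-in for a wide-character buffer: every ASCII character is
# followed by a NUL byte, so a C-style read stops after the first one.
wide = html.encode("utf-16-le")

# UTF-8 bytes contain no embedded NUL bytes for this input.
narrow = html.encode("utf-8")

print(c_string_view(wide))
print(c_string_view(narrow))
```

If this turns out to be the cause, encoding the value to UTF-8 bytes
before handing it to tidy would be the thing to try.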

Thanks
Iain


