Not really. The purpose of getting the internal_hex in most tests is precisely to inspect the internals of the pdf_text_t, so it should be retrieved as UTF-32HE, not as UTF-8.
Is there a way to test the desired functionality without accessing the internal fields of pdf_text_t? If we get the string as UTF-8 through the public API and it is correct, that would mean the internal representation (whether UTF-32HE or anything else) is fine. Right?
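
To make the black-box idea concrete, here is a minimal sketch of such a check. It does not use the real library: toy_text_t, toy_text_get_utf8 and toy_text_public_api_check are hypothetical stand-ins invented for illustration, and the toy type simply assumes the code points are stored as UTF-32 in host byte order. The check only looks at the UTF-8 bytes returned by the getter and never reads the stored fields.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for pdf_text_t (NOT the real library type):
   code points stored as UTF-32 in host byte order. */
typedef struct {
    uint32_t cp[16];
    size_t   len;
} toy_text_t;

/* Hypothetical public getter: encode the stored code points as UTF-8.
   Returns the number of bytes written, or 0 on an invalid code point
   or if the output buffer is too small. */
static size_t toy_text_get_utf8(const toy_text_t *t,
                                unsigned char *out, size_t cap)
{
    size_t n = 0;
    for (size_t i = 0; i < t->len; i++) {
        uint32_t c = t->cp[i];
        size_t need = c < 0x80 ? 1 : c < 0x800 ? 2 : c < 0x10000 ? 3 : 4;

        /* Reject values outside Unicode, surrogates, and overflow. */
        if (c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF) || n + need > cap)
            return 0;

        if (need == 1) {
            out[n++] = (unsigned char)c;
        } else if (need == 2) {
            out[n++] = (unsigned char)(0xC0 | (c >> 6));
            out[n++] = (unsigned char)(0x80 | (c & 0x3F));
        } else if (need == 3) {
            out[n++] = (unsigned char)(0xE0 | (c >> 12));
            out[n++] = (unsigned char)(0x80 | ((c >> 6) & 0x3F));
            out[n++] = (unsigned char)(0x80 | (c & 0x3F));
        } else {
            out[n++] = (unsigned char)(0xF0 | (c >> 18));
            out[n++] = (unsigned char)(0x80 | ((c >> 12) & 0x3F));
            out[n++] = (unsigned char)(0x80 | ((c >> 6) & 0x3F));
            out[n++] = (unsigned char)(0x80 | (c & 0x3F));
        }
    }
    return n;
}

/* Black-box check: compares only the getter's output with the expected
   UTF-8 bytes for U+0048, U+00E9, U+20AC, U+1F600. Returns 1 on success. */
static int toy_text_public_api_check(void)
{
    toy_text_t t = { { 0x48, 0xE9, 0x20AC, 0x1F600 }, 4 };
    static const unsigned char expected[] = {
        0x48, 0xC3, 0xA9, 0xE2, 0x82, 0xAC, 0xF0, 0x9F, 0x98, 0x80
    };
    unsigned char buf[32];
    size_t n = toy_text_get_utf8(&t, buf, sizeof buf);

    return n == sizeof expected && memcmp(buf, expected, n) == 0;
}
```

A test body would then just call assert(toy_text_public_api_check());. Note that a check of this kind only confirms the observable output. It cannot tell which encoding is actually stored, which is exactly what a test reading internal_hex is after.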
