I grabbed this code from the online source. I'm pretty sure it is from HAPI 2.1,
although I think I checked 2.0 and found this routine hadn't changed.
As to a pluggable version: if the proposal that "a single escape character on
its own does not constitute an escape sequence and should not be removed from
the message" is accepted, I don't need a pluggable version. If, however, single
escape characters continue to be removed from messages such that it cannot be
determined that this has occurred, then I will need a pluggable version. The
trickiest part from my end is that the character is gone, and I can't tell it
was ever there.
Nearly every other case I have encountered where delimiter characters are not
escaped (and I have seen them all), there has been some way to tell (except
maybe the field delimiter, that one sucks to figure out).
In general I would recommend that there be test cases around the place
(including bad behaviours) that decode then encode a message, segment or field,
and compare the input to the output. If they don't match, try to figure out if
it is possible to make them match. Some cases may indicate that a bad sender
can really stuff things up, and there is no fixing it. These cases may be wise
to mention in the documentation. I have a substantial amount of compensation
code for various fields and data types to try to help with unescaped delimiters,
principally to try to get the field to look like what the data on the sender's
application screen looked like, and pass it on correctly escaped. We seem to
have a LOT of badly behaved vendors, and apparently no power to make them fix
it. There are some big names on the list too.
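A round-trip check of the kind recommended above could be sketched roughly as follows. This is only an illustration under stated assumptions, not HAPI code: the class RoundTripSketch and its escape, unescape and findMismatches methods are hypothetical stand-ins, they handle only the five default delimiters |^~\&, and they keep a lone backslash rather than dropping it.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of HAPI): decodes then re-encodes text and reports mismatches. */
final class RoundTripSketch {
    // Default delimiters in MSH order: field, component, repetition, escape, subcomponent.
    private static final String CHARS = "|^~\\&";
    // Escape codes matching CHARS position by position.
    private static final String CODES = "FSRET";

    /** Replaces \F\ \S\ \R\ \E\ \T\ with the delimiter; any other backslash is kept as-is. */
    static String unescape(String text) {
        StringBuilder out = new StringBuilder(text.length());
        int i = 0;
        while (i < text.length()) {
            char c = text.charAt(i);
            if (c == '\\' && i + 2 < text.length() && text.charAt(i + 2) == '\\') {
                int code = CODES.indexOf(text.charAt(i + 1));
                if (code >= 0) {
                    out.append(CHARS.charAt(code));
                    i += 3;
                    continue;
                }
            }
            out.append(c);
            i++;
        }
        return out.toString();
    }

    /** Replaces each delimiter with its three-character escape sequence. */
    static String escape(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            int idx = CHARS.indexOf(c);
            if (idx >= 0) {
                out.append('\\').append(CODES.charAt(idx)).append('\\');
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    /** Returns the inputs that do not survive decode-then-encode unchanged. */
    static List<String> findMismatches(List<String> inputs) {
        List<String> bad = new ArrayList<>();
        for (String in : inputs) {
            if (!escape(unescape(in)).equals(in)) {
                bad.add(in);
            }
        }
        return bad;
    }
}
```

Any input reported by findMismatches is a candidate for either a fix or a note in the documentation that a badly behaved sender cannot be fully compensated for.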
Thanks for your consideration.
Ian
>>> Christian Ohr <christian....@gmail.com> 05/09/13 2:08 >>>
Yes, proper escaping is an endless source of joy ;-) What HAPI version are you
using, btw?
I'll take a look at your code in the next few days if time permits.
When modifying existing functionality, we always face the problem of backwards
compatibility. So for the past one or two releases, we have added ways to plug
in custom strategies while keeping the default, rather than changing existing
behavior.
So far, Escape is unfortunately very static, but for 2.2 we can think about
making the escaping strategy pluggable just like other things in HAPI. Thoughts?
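To make the idea of a pluggable escaping strategy concrete, here is a minimal sketch. Every name in it (EscapingStrategy, PassThroughEscaping, EscapingConfig) is hypothetical and invented for illustration; none of it is taken from HAPI, and the real design may look quite different.

```java
/** Hypothetical strategy interface; the name and method signatures are assumptions, not HAPI API. */
interface EscapingStrategy {
    String escape(String text, String encChars);
    String unescape(String text, String encChars);
}

/** Trivial strategy that leaves text untouched; it merely stands in for a real default. */
final class PassThroughEscaping implements EscapingStrategy {
    public String escape(String text, String encChars) {
        return text;
    }

    public String unescape(String text, String encChars) {
        return text;
    }
}

/** Hypothetical configuration holder: callers can swap the strategy, the default stays as-is. */
final class EscapingConfig {
    private EscapingStrategy strategy = new PassThroughEscaping();

    EscapingStrategy getStrategy() {
        return strategy;
    }

    void setStrategy(EscapingStrategy strategy) {
        if (strategy == null) {
            throw new IllegalArgumentException("strategy must not be null");
        }
        this.strategy = strategy;
    }
}
```

The point of the indirection is that existing users keep the default behaviour unless they explicitly install their own strategy.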
cheers
Christian
2013/9/4 Ian Vowles <ian_vow...@health.qld.gov.au>
I have sent mails to the general list about this issue before, and the advice
has helped me progress.
Then along comes another system that has slightly different behaviour.
In this particular case a system correctly escapes the HL7 delimiters EXCEPT
the escape delimiter. This allows it to send field content like this (from an
address):
1 \ 24 Smith \T\ Wesson Road
I was hopeful that, since the single escape on its own didn't form part of an
escape sequence, it might be preserved through the parse. This is not the
case. The lone backslash is consumed in the process and disappears. I don't
know how valid an argument it is to say it should be preserved, but if it
isn't, I can't subsequently properly escape it to send to a downstream system.
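The distinction between a lone escape character and one that opens an escape sequence can be sketched as below. This is an illustrative assumption about how such a check might work, not the logic HAPI uses: the class LoneEscapeFinder and its methods are hypothetical, and the set of recognised codes (F, S, R, E, T, H, N, plus bodies starting with ., C, M, X or Z) is a simplification.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of HAPI): locates escape characters that open no sequence. */
final class LoneEscapeFinder {

    /** Returns the indices of escape characters that do not open a recognisable escape sequence. */
    static List<Integer> findLoneEscapes(String text, char escape) {
        List<Integer> lone = new ArrayList<>();
        int i = 0;
        while (i < text.length()) {
            if (text.charAt(i) != escape) {
                i++;
                continue;
            }
            int close = text.indexOf(escape, i + 1);
            if (close > i + 1 && looksLikeSequence(text.substring(i + 1, close))) {
                i = close + 1; // skip the whole sequence, including its closing escape
            } else {
                lone.add(i);   // no closing escape, or the body is not a known code
                i++;
            }
        }
        return lone;
    }

    /** Simplified test of the text between two escape characters; body is never empty here. */
    private static boolean looksLikeSequence(String body) {
        if (body.length() == 1) {
            return "FSRETHN".indexOf(body.charAt(0)) >= 0;
        }
        return ".CMXZ".indexOf(body.charAt(0)) >= 0;
    }
}
```

For the address above, the backslash after "1 " is reported as lone, while the \T\ sequence is skipped as a whole.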
Given that I had been dealing with HL7 for some time before I found HAPI, I had
done some work previously on an encode / unencode routine. My own code couldn't
cope with this one either.
I decided it was time to be brave and dive into the HAPI code. Somewhere there
had to be low-level encode/unencode routines.
Up until I looked in the source, I had been creating a new ST object and using
its parse and encode methods. Once I looked into the source I found the Escape
class.
This updated version of Escape does the following:
- Preserves escape characters that do not form part of an escape sequence
- Permits the exceptional escape sequence case of \X000d\ to work when the
  escape character has been changed to something other than \
- Adds the extra hex escape codes \X0D\ and \X0A\ because we see them here
  occasionally
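The second point can be illustrated with a small sketch: the carriage-return escape is built from whatever escape character the message declares, instead of being hardcoded with a backslash. The class HexEscapeSketch and its methods are hypothetical names used only for this example; the only assumption carried over from the attached code is that the escape character is the fourth encoding character.

```java
/** Hypothetical helper (not part of HAPI): builds the \X000d\ escape from the configured escape character. */
final class HexEscapeSketch {

    /** Returns the carriage-return escape sequence for the given encoding characters. */
    static String carriageReturnEscape(String encChars) {
        char escape = encChars.charAt(3); // fourth encoding character is the escape character
        return escape + "X000d" + escape;
    }

    /** Replaces every carriage-return escape sequence in the text with an actual carriage return. */
    static String decodeCarriageReturns(String text, String encChars) {
        return text.replace(carriageReturnEscape(encChars), "\r");
    }
}
```

With encoding characters such as |^~$&, the sequence $X000d$ is decoded, whereas a hardcoded \X000d\ lookup would miss it.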
Test case code is also included at the bottom, including my now infamous
"HATER" example :-). Test cases with lots of > < are there because we often do
transforms between HL7 and XML, so we often look at these in additional test
cases of the XML output produced.
What are my chances of this being adopted?
If not, how can I get my version to override the existing one?
Thanks
Ian
----------
/**
The contents of this file are subject to the Mozilla Public License Version 1.1
(the "License"); you may not use this file except in compliance with the
License.
You may obtain a copy of the License at http://www.mozilla.org/MPL/
Software distributed under the License is distributed on an "AS IS" basis,
WITHOUT WARRANTY OF ANY KIND, either express or implied. See the License for the
specific language governing rights and limitations under the License.
The Original Code is "Escape.java". Description:
"Handles "escaping" and "unescaping" of text according to the HL7 escape sequence rules
defined in section 2.10 of the standard (version 2.4)"
The Initial Developer of the Original Code is University Health Network. Copyright (C)
2001. All Rights Reserved.
Contributor(s): Mark Lee (Skeva Technologies); Elmar Hinz
Alternatively, the contents of this file may be used under the terms of the
GNU General Public License (the "GPL"), in which case the provisions of the GPL are
applicable instead of those above. If you wish to allow use of your version of this
file only under the terms of the GPL and not to allow others to use your version
of this file under the MPL, indicate your decision by deleting the provisions above
and replace them with the notice and other provisions required by the GPL License.
If you do not delete the provisions above, a recipient may use your version of
this file under either the MPL or the GPL.
*/
package ca.uhn.hl7v2.parser;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
/**
* Handles "escaping" and "unescaping" of text according to the HL7 escape
* sequence rules defined in section 2.10 of the standard (version 2.4).
* Currently, escape sequences for multiple character sets are unsupported. The
* highlighting and locally defined escape sequences are also
* unsupported.
* The only hexadecimal escapes supported are X000d, X0D, X0A
*
* @author Bryan Tripp
* @author Mark Lee (Skeva Technologies)
* @author Elmar Hinz
* @author Christian Ohr
*/
public class Hl7Escape {
/** Creates a new instance of Hl7Escape */
public Hl7Escape() {
}
/**
* @param text string to be escaped
* @return the escaped string
* <p>Defaults the escape characters to the conventional values |^~\&
*/
public static String escape(String text) {
return escape(text,"|^~\\&");
}
/**
* @param text string to be escaped
* @param encChars encoding characters to be used in the order
* <br>Field, Component, Repetition, Escape, Sub-component
* @return the escaped string
*/
public static String escape(String text, String encChars) {
EncLookup esc = getEscapeSequences(encChars);
int textLength = text.length();
StringBuilder result = new StringBuilder(textLength);
for (int i = 0; i < textLength; i++) {
boolean charReplaced = false;
char c = text.charAt(i);
FORENCCHARS:
for (int j = 0; j < 6; j++) {
if (text.charAt(i) == esc.characters[j]) {
// Formatting escape sequences such as \.br\ should be left alone.
// In this version's ordering the escape character sits at index 3.
if (j == 3) {
if (i+1 < textLength) {
// Check for \.br\
char nextChar = text.charAt(i + 1);
switch (nextChar) {
case '.':
case 'C':
case 'M':
case 'X':
case 'Z':
{
int nextEscapeIndex = text.indexOf(esc.characters[j], i + 1);
if (nextEscapeIndex > 0) {
result.append(text.substring(i, nextEscapeIndex + 1));
charReplaced = true;
i = nextEscapeIndex;
break FORENCCHARS;
}
break;
}
case 'H':
case 'N':
{
// Compare against the configured escape character, not a hardcoded backslash
if (i+2 < textLength && text.charAt(i+2) == esc.characters[3]) {
int nextEscapeIndex = i + 2;
if (nextEscapeIndex > 0) {
result.append(text.substring(i, nextEscapeIndex + 1));
charReplaced = true;
i = nextEscapeIndex;
break FORENCCHARS;
}
}
break;
}
}
}
}
result.append(esc.encodings[j]);
charReplaced = true;
break;
}
}
if (!charReplaced) {
result.append(c);
}
}
return result.toString();
}
/**
* @param text string to be unescaped
* @return the unescaped string
* <p>Defaults the escape characters to the conventional values |^~\&
*/
public static String unescape(String text) {
return unescape(text,"|^~\\&");
}
/**
* @param text string to be unescaped
* @param encChars encoding characters to be used in the order
* <br>Field, Component, Repetition, Escape, Sub-component
* @return the unescaped string
*/
public static String unescape(String text, String encChars) {
// If the escape char isn't found, we don't need to look for escape sequences
char escapeChar = encChars.charAt(3);
boolean foundEscapeChar = false;
for (int i = 0; i < text.length(); i++) {
if (text.charAt(i) == escapeChar) {
foundEscapeChar = true;
break;
}
}
if (!foundEscapeChar) {
return text;
}
int textLength = text.length();
StringBuilder result = new StringBuilder(textLength + 20);
EncLookup esc = getEscapeSequences(encChars);
char escape = esc.characters[3];
int encodingsCount = esc.characters.length;
int i = 0;
while (i < textLength) {
char c = text.charAt(i);
if (c != escape) {
result.append(c);
i++;
} else {
boolean foundEncoding = false;
// Test against the standard encodings
for (int j = 0; j < encodingsCount; j++) {
String encoding = esc.encodings[j];
int encodingLength = encoding.length();
if ((i + encodingLength <= textLength) && text.substring(i, i + encodingLength)
.equals(encoding)) {
result.append(esc.characters[j]);
i += encodingLength;
foundEncoding = true;
break;
}
}
if (!foundEncoding) {
// If we haven't found this, there is one more option. Escape sequences of
// /.XXXXX/ are formatting codes. They should be left intact
if (i + 1 < textLength) {
char nextChar = text.charAt(i + 1);
switch (nextChar) {
case '.':
case 'C':
case 'M':
case 'X':
case 'Z':
{
int closingEscape = text.indexOf(escape, i + 1);
if (closingEscape > 0) {
String substring = text.substring(i, closingEscape + 1);
result.append(substring);
i += substring.length();
} else {
// No closing escape: preserve the lone escape character instead of dropping it
result.append(c);
i++;
}
break;
}
case 'H':
case 'N':
{
int closingEscape = text.indexOf(escape, i + 1);
if (closingEscape == i + 2) {
String substring = text.substring(i, closingEscape + 1);
result.append(substring);
i += substring.length();
} else {
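// Not a well-formed \H\ or \N\: preserve the lone escape character instead of dropping it
result.append(c);
i++;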
}
break;
}
default:
{
// Preserve unescaped escape delimiter
result.append(c);
i++;
}
}
} else {
// Preserve unescaped escape delimiter
result.append(c);
i++;
}
}
}
}
return result.toString();
}
/**
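* Returns a lookup that pairs the HL7 special characters with their
* corresponding escape sequences.
* @param encChars encoding characters in the order
* Field, Component, Repetition, Escape, Sub-component
* @return the lookup built from the given encoding characters
*/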
private static EncLookup getEscapeSequences(String encChars) {
EncLookup escapeSequences = new EncLookup(encChars);
return escapeSequences;
}
/**
* A performance-optimized replacement for using when
* mapping from HL7 special characters to their respective
* encodings
*
* @author Christian Ohr
*/
private static class EncLookup {
char[] characters = new char[8];
String[] encodings = new String[8];
EncLookup(String ec) {
characters[0] = ec.charAt(0);
characters[1] = ec.charAt(1);
characters[2] = ec.charAt(2);
characters[3] = ec.charAt(3);
characters[4] = ec.charAt(4);
characters[5] = '\r';
characters[6] = '\r';
characters[7] = '\n';
char escapeChar = ec.charAt(3);
char[] codes = {'F', 'S', 'R', 'E', 'T'};
for (int i = 0; i < codes.length; i++) {
StringBuilder seq = new StringBuilder();
seq.append(escapeChar);
seq.append(codes[i]);
seq.append(escapeChar);
encodings[i] = seq.toString();
}
// encodings[5] = "\\X000d\\";
encodings[5] = escapeChar + "X000d" + escapeChar;
encodings[6] = escapeChar + "X0D" + escapeChar;
encodings[7] = escapeChar + "X0A" + escapeChar;
}
}
}
-----
Test case:
/*
* To change this template, choose Tools | Templates
* and open the template in the editor.
*/
package ca.uhn.hl7v2.parser;
import org.junit.After;
import org.junit.AfterClass;
import org.junit.Before;
import org.junit.BeforeClass;
import org.junit.Test;
import static org.junit.Assert.*;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
/**
*
* @author vowlesi
*/
public class SingleBackslashV3Test {
private static final Logger log = LoggerFactory.getLogger(SingleBackslashV3Test.class);
private String encChars = "|^~\\&";
public SingleBackslashV3Test() {
}
@BeforeClass
public static void setUpClass() {
}
@AfterClass
public static void tearDownClass() {
}
@Before
public void setUp() {
}
@After
public void tearDown() {
}
/**
* Test of unescape method, of class Escape.
*/
@Test
public void testUnescapeSingleBackslash() {
log.debug("unescape with single backslash");
String text = "1 \\ 24 Smith \\T\\ Wesson Road";
String expResult = "1 \\ 24 Smith & Wesson Road";
String result = Hl7Escape.unescape(text);
log.debug("Input : " + text);
log.debug("Result : " + result);
log.debug("Expected : " + expResult);
assertEquals(expResult, result);
text = "\\H\\A\\T\\E\\R\\\\N\\<<\\S\\>>\"\\E\\''\\F\\Special test '\\XFFFFFFFFFFFFFFFFFFFF\\'";
expResult = "\\H\\A&E~\\N\\<<^>>\"\\''|Special test '\\XFFFFFFFFFFFFFFFFFFFF\\'";
result = Hl7Escape.unescape(text);
log.debug("Input : " + text);
log.debug("Result : " + result);
log.debug("Expected : " + expResult);
assertEquals(expResult, result);
text = "\\H\\A\\T\\E\\R\\\\N\\<<\\S\\>>\"\\E\\''\\F\\Special test '\\X000d\\'";
expResult = "\\H\\A&E~\\N\\<<^>>\"\\''|Special test '\r\'";
result = Hl7Escape.unescape(text);
log.debug("Input : " + text);
log.debug("Result : " + result);
log.debug("Expected : " + expResult);
assertEquals(expResult, result);
text = "\\\\\\\\\\\\\\\\\\\\";
expResult = "\\\\\\\\\\\\\\\\\\\\";
result = Hl7Escape.unescape(text);
log.debug("Input : " + text);
log.debug("Result : " + result);
log.debug("Expected : " + expResult);
assertEquals(expResult, result);
text = "Ken\\n\\F\\edy";
expResult = "Ken\\E\\n\\F\\edy";
result = Hl7Escape.unescape(text);
result = Hl7Escape.escape(result);
log.debug("Input : " + text);
log.debug("Result : " + result);
log.debug("Expected : " + expResult);
assertEquals(expResult, result);
}
}
********************************************************************************
This email, including any attachments sent with it, is confidential and for the
sole use of the intended recipient(s). This confidentiality is not waived or
lost, if you receive it and you are not the intended recipient(s), or if it is
transmitted/received in error.
Any unauthorised use, alteration, disclosure, distribution or review of this
email is strictly prohibited. The information contained in this email,
including any attachment sent with it, may be subject to a statutory duty of
confidentiality if it relates to health service matters.
If you are not the intended recipient(s), or if you have received this email in
error, you are asked to immediately notify the sender by telephone collect on
Australia +61 1800 198 175 or by return email. You should also delete this
email, and any copies, from your computer system network and destroy any hard
copies produced.
If not an intended recipient of this email, you must not copy, distribute or
take any action(s) that relies on it; any form of disclosure, modification,
distribution and/or publication of this email is also prohibited.
Although Queensland Health takes all reasonable steps to ensure this email does
not contain malicious software, Queensland Health does not accept
responsibility for the consequences if any person's computer inadvertently
suffers any disruption to services, loss of information, harm or is infected
with a virus, other malicious computer programme or code that may occur as a
consequence of receiving this email.
Unless stated otherwise, this email represents only the views of the sender and
not the views of the Queensland Government.
**********************************************************************************
_______________________________________________
Hl7api-devel mailing list
Hl7api-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/hl7api-devel