Java String is an object and as such a member of type String can be null or allocated in the heap. In C++ std::string is a value and as such a member of type std::string can never be null and is (usually) stack allocated with the contents allocated (usually) on the heap. In C/C++ char* is a decayed char[] and a such a member of type char* can be null or a pointer to an allocated char[] either on the heap or the stack.
As we move away from char* representations of strings which support null to std::string that does not we run into an interesting problem to solve. What to do about null strings in a PDX object? The original PdxReader readString returned a char*, which could be null or allocated a char[] on the heap and returned the pointer (usually, though I found some places where it returned the address to a static constant char[] but without const qualifier, bad). This version of PdxReader::readString then supports null string values coming from Java. Looking at options for converting to std::string we have the following: *1) std::string PdxReader::readString(const std::string& fieldName)* Most natural for C++11 but does not support null. A null PDX string value would be converted into an empty std::string. Upon writing back to PDX the null value would be replaced with and empty PDX string value. This could impact the other domain objects derived from this same PDX type if they treat null and empty strings differently. For example: class Foo { std::string naturalStringMember; ~Foo() = default; void fromData(PdxReader reader) { // RVO moved contents unnaturalStringMember = readString("naturalStringMember"); // null value lost } }; *2) std::string* PdxReader::readString(const std::string& fieldName)* The subtle difference there is returning raw pointer to a heap allocated std::string, which is very very unnatural to C++. The domain object would either hold the raw pointer or check for null and copy to a std::string value member. In either case they would have to free the allocated std::string when done. For example: class Foo { std::string* unnaturalStringMember; std::string naturalStringMember; ~Foo() { if (unnaturalStringMember) { delete unnaturalStringMember; } } void fromData(PdxReader reader) { unnaturalStringMember = readString("unnaturalStringMember"); if (auto tmp = readString("naturalStringMember")) { // move contents naturalStringMember = std::move(*naturalStringMember); // delete delete naturalStringMember; } else { // null value lost } } }; *YUCK!!* *3) std::unique_ptr<std::string> PdxReader::readString(const std::string& fieldName)* This shares some of the similar issues with unnaturally having a heap allocated std::string but solves the ownership ambiguity and deleting issues. For example: class Foo { std::unique_ptr<std::string> unnaturalStringMember; std::string naturalStringMember; ~Foo() = default; void fromData(PdxReader reader) { unnaturalStringMember = readString("unnaturalStringMember"); if (auto tmp = readString("naturalStringMember")) { // move contents and delete naturalStringMember = std::move(*tmp); } else { // null value lost } } }; *Eh!* 4) Given that C++ does not allow the same method signature only differing in return type we can't have all these methods on the interface unless we named them differently as well. For example readString and readNullableString. Providing both puts the power in the implementor but maybe we should have an opinionated approach. std::string PdxReader::readString(const std::string& fieldName); std::unique_ptr<std::string> PdxReader::readNullableString(const std::string& fieldName); class Foo { std::unique_ptr<std::string> nullableStringMember; std::string naturalStringMember; ~Foo() = default; void fromData(PdxReader reader) { nullableStringMember = readNullabletring("nullableStringMember"); naturalStringMember = readString("uniqueStringMember"); } }; 5) Alternatively one could use out parameters, though we have been trying to get away from out parameters since they don't match the "functional" preference. void PdxReader::readString(const std::string& fieldName std::string& value); void PdxReader::readString(const std::string& fieldName, std::string* value); void PdxReader::readString(const std::string& fieldName, std::unique_ptr<std::string>& value); class Foo { std::string* rawStringMember; std::unique_ptr<std::string> uniqueStringMember; std::string naturalStringMember; ~Foo() = default; void fromData(PdxReader reader) { readString("rawStringMember", rawStringMember); readString("uniqueStringMember", uniqueStringMember); readString("naturalStringMember", naturalStringMember); } }; Does supporting null really buy us anything worth the pain an anguish of not looking very C++? My preference is probably towards non-null by providing a "nullable" version of the method for backwards compatibility (option 4). Thoughts? What about arrays of strings? There were supported as a decayed char[][] as char**. Defined as char** PdxReader::readStringArray(const std::string& fieldName, int32_t& len). Where len was an out param indicating the size of the array. The function could return null since Java arrays are objects and therefor can be null. Each element of the array could also be null since String in Java is nullable. To cleanup one had to iterate the array and delete each element if not null then delete the array. Should we consider putting this into a std::vector<std::string>. Again we lose the null support but maybe it is worth it. Alternatively we could support a nullable version as std::unique_ptr<vector<std::unique_ptr<std::string>>>. More thoughts? -Jake