Hey there,

There's usually a control character or something that causes this behavior.
I don't see the source data hexdump attachment. Could you please reattach?

-Abe

On Wed, Oct 8, 2014 at 5:29 AM, shakun grover <[email protected]> wrote:

> Even when I view this data in Hive, it takes
> 1,'James','A','Bond','557502533','(681) 675-8580','[email protected]','
> www.google.com','110 Campus Dr. Berkeley CA 94111
> as first column of first row
> then
> ','San Jose','CA','94500','USA' as first column of second row. Other
> columns are given value as NULL
>
> On Wed, Oct 8, 2014 at 4:38 PM, shakun grover <[email protected]> wrote:
>
> > Hi Abe,
> >
> > I have attached the sample data with this mail.
> >
> > This is the job that I created to import this data to HDFS:
> >
> > *Job:*
> > Name: testEmp
> >
> > Database configuration
> >
> > Schema name:
> > Table name:
> > Table SQL statement: select * from test.emp WHERE ${CONDITIONS}
> > Table column names:
> > Partition column name: id
> > Nulls in partition column:
> > Boundary query:
> >
> > Output configuration
> >
> > Storage type:
> > 0 : HDFS
> > Choose:
> > Output format:
> > 0 : TEXT_FILE
> > 1 : SEQUENCE_FILE
> > Choose: 0
> > Output directory: /tmp/emp/1
> >
> > Throttling resources
> >
> > Extractors:
> > Loaders:
> > Job was successfully updated with status FINE
> >
> > When I view the data with the below mentioned command:
> > hadoop fs -cat /tmp/emp/p*
> > It shows me data as follows (it inserts a line break after 110 Campus Dr.
> > Berkeley CA 94111):
> >
> > 1,'James','A','Bond','557502533','(681) 675-8580','[email protected]','
> > www.google.com','110 Campus Dr. Berkeley CA 94111
> > ','San Jose','CA','94500','USA'
> > 2,'James','A','Bond','557502533','(681) 675-8580','[email protected]','
> > www.google.com','110 Campus Dr. Berkeley CA 94111
> > ','San Jose','CA','94500','USA'
> > 3,'James','A','Bond','557502533','(681) 675-8580','[email protected]','
> > www.google.com','110 Campus Dr. Berkeley CA 94111
> > ','San Jose','CA','94500','USA'
> > 4,'James','A','Bond','557502533','(681) 675-8580','[email protected]','
> > www.google.com','110 Campus Dr. Berkeley CA 94111
> > ','San Jose','CA','94500','USA'
> > 5,'James','A','Bond','557502533','(681) 675-8580','[email protected]','
> > www.google.com','110 Campus Dr. Berkeley CA 94111
> > ','San Jose','CA','94500','USA'
> >
> >
> > On Wed, Oct 8, 2014 at 1:03 AM, Abraham Elmahrek <[email protected]> wrote:
> >
> >> Could we take a peek at your data from its source as hex?
> >>
> >> -Abe
> >>
> >> On Tue, Oct 7, 2014 at 3:46 AM, shakun grover <[email protected]> wrote:
> >>
> >> > Yes, that's correct that Sqoop2 should insert new lines at the end of a
> >> > record.
> >> > But if that record has many columns, say (>15) columns in a record, then
> >> > after a few columns, it inserts a new line.
> >> >
> >> > Example:
> >> > 1,'346088103340400','3410 9240 5550
> >> > 778','3710-1690-2390-472','537436268','537 43 6268
> >> >
> >> > ','537-43-6268
> >> >
> >> > ','6816758580
> >> >
> >> > ','681 675 8580
> >> >
> >> > ','681-675-8580
> >> >
> >> > ','(681) 675-8580
> >> >
> >> > ','(681)675-8580
> >> >
> >> > ','1617547959','12.215.42.19
> >> >
> >> > ','','1132286141
> >> >
> >> > ','https://blu162.mail.live.com
> >> >
> >> > ','110 Campus Dr. Berkeley CA 94111
> >> >
> >> > ','James
> >> >
> >> > '
> >> > This is one record which got imported to HDFS in the above mentioned
> >> > format. After the 6th column it inserted a new line, and then after each
> >> > column it inserted a new line. Though this behavior of inserting new
> >> > lines is not the same in all the cases.
> >> > It inserts new lines randomly after the nth column.
> >> >
> >> >
> >> > On Thu, Oct 2, 2014 at 1:12 AM, Abraham Elmahrek <[email protected]> wrote:
> >> >
> >> > > Hey there,
> >> > >
> >> > > Sqoop2 should insert new lines at the end of a record. In fact, Sqoop2
> >> > > should just write CSV. Could you copy/paste an example with Schema?
> >> > >
> >> > > -Abe
> >> > >
> >> > > On Tue, Sep 30, 2014 at 11:32 PM, shakun grover <[email protected]>
> >> > > wrote:
> >> > >
> >> > > > Hi All,
> >> > > >
> >> > > > When I import many columns (say >20 columns) from RDBMS to HDFS, then
> >> > > > Sqoop2 inserts a new line in the output file. The newline appears at
> >> > > > the end of certain fields. It doesn't seem to appear for every single
> >> > > > field.
> >> > > >
> >> > > > Can you please tell me why this new line is inserted? And is there any
> >> > > > way to avoid this?
> >> > > >
> >> > > > Thanks in advance!!
> >> > > >
> >> > > >
> >> > > > --
> >> > > > Thanks & Regards,
> >> > > > Shakun Grover
> >> > > >
> >> > >
> >> >
> >> >
> >> > --
> >> > Thanks & Regards,
> >> > Shakun Grover
> >> >
> >>
> >
> >
> > --
> > Thanks & Regards,
> > Shakun Grover
> >
>
>
> --
> Thanks & Regards,
> Shakun Grover
>
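
A minimal sketch of the control-character check suggested at the top of this thread, assuming the source value of the address column ends in a line feed or carriage return (the thread never confirms which). The helper names `find_control_chars` and `hexdump` and the sample value are hypothetical and are not part of Sqoop2:

```python
def find_control_chars(value):
    """Return (index, hex code) pairs for each ASCII control character in value."""
    return [
        (i, f"0x{ord(ch):02x}")
        for i, ch in enumerate(value)
        if ord(ch) < 0x20 or ord(ch) == 0x7F
    ]


def hexdump(value, encoding="utf-8"):
    """Return the bytes of value as space-separated two-digit hex."""
    return " ".join(f"{b:02x}" for b in value.encode(encoding))


# Hypothetical field value: the address from the thread with an assumed
# trailing line feed, which would explain the break before the closing quote.
address = "110 Campus Dr. Berkeley CA 94111\n"

print(find_control_chars(address))  # positions and codes of control characters
print(hexdump(address[-4:]))        # hex of the last few characters
```

If a check like this reports 0x0a or 0x0d inside a field, the extra line breaks are coming from the source data, not from the writer; one option would then be to strip or replace those characters in the table SQL statement before import.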
