Mahmood Naderan via Python-list wrote: > I wrote this: > > a = np.zeros((p.max_row, p.max_column), dtype=object) > for y, row in enumerate(p.rows): > for cell in row: > print (cell.value) > a[y] = cell.value
In the line above you overwrite the row in the numpy array with the cell value. In combination with numpy's "broadcasting" you end up with all values in a row set to the rightmost cell in the spreadsheet row, just like in >>> import numpy >>> a = numpy.array([[0, 0, 0]]) >>> a array([[0, 0, 0]]) >>> for x in 1, 2, 3: ... a[0] = x ... >>> a array([[3, 3, 3]]) The correct code: for y, row in enumerate(ws.rows): a[y] = [cell.value for cell in row] I think I posted it before ;) > print (a[y]) > > > For one of the cells, I see > > NM_198576.3 > ['NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' > 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' > 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' > 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' > 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' > 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' > 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' > 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' > 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' > 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3'] > > > These are 50 NM_198576.3 in a[y] and 50 is the number of columns in my > excel file (p.max_column) > > > > The excel file looks like > > CHR1 11,202,100 NM_198576.3 PASS 3.08932 G|B|C - > . . . > > > > Note that in each row, some cells are '-' or '.' only. I want to read all > cells as string. Then I will write the matrix in a file and my main code > (java) will process that. I chose openpyxl for reading excel files, > because Apache POI (a java package for manipulating excel files) consumes > huge memory even for medium files. > > So my python script only transforms an xlsx file to a txt file keeping the > cell positions and formats. What kind of text file? > Any suggestion? In that case there's no need to load the data into memory. For example, to convert xlsx to csv: #!/usr/bin/env python3 from openpyxl import load_workbook import csv source = "beta.xlsx" dest = "gamma.csv" sheet = 'alpha' wb = load_workbook(filename=source, read_only=True) ws = wb[sheet] with open(dest, "w") as outstream: csv.writer(outstream).writerows( [cell.value for cell in row] for row in ws.rows ) -- https://mail.python.org/mailman/listinfo/python-list